mlu5 for multiple transcripts

Elma Blom

Oct 10, 2016, 10:34:43 AM
to chibolts
Hi, 

We are using the mlu5 command without a %mor tier. This works well if we run the following command for one child at a time:

maxwd +t*CHI +g2 +c5 +d1 -t%mor *.cha | mlu -t%mor

However, if we select the transcripts of multiple children, MLU is calculated over all transcripts combined rather than for each child/transcript separately (so we end up with one output file). If I understand the manual correctly, a batch file does not really solve the issue, because we would still need to list every transcript, and there are hundreds.

Best wishes, Elma

Leonid Spektor

Oct 10, 2016, 3:23:54 PM
to chib...@googlegroups.com

Elma,

    I assume you had a question. If not, I apologize for replying. If you did, then the reason all files are combined into one is that you are using a pipe '|'. A pipe combines the output for all files into one stream, so MLU only sees that single input. In the vast majority of cases pipes should not be used at all. The correct command lines are:

maxwd +t*CHI +g2 +c5 +d1 -t%mor +f *.cha
mlu -t%mor *.mxwrd.cex
Leonid.
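
To see why the piped version yields a single figure, here is a minimal Python sketch of the idea. It is not CLAN and does not reproduce its counting rules: it assumes utterance length is the number of whitespace-separated words, and the file names and utterances are hypothetical toy data.

```python
from statistics import mean


def mlu5(utterances, n=5):
    """Mean length, in whitespace-separated words, of the n longest utterances."""
    lengths = sorted((len(u.split()) for u in utterances), reverse=True)
    return mean(lengths[:n])


# Hypothetical toy data: transcript name -> child utterances.
transcripts = {
    "child_a.cha": [
        "more juice",
        "I want the red ball now",
        "where daddy go",
        "doggie is eating food",
        "no",
        "put it on the table",
    ],
    "child_b.cha": [
        "ball",
        "my ball",
        "want cookie",
        "mommy read book",
        "up",
        "go outside",
    ],
}

# One value per transcript: what running the two commands separately aims for.
per_file = {name: mlu5(utts) for name, utts in transcripts.items()}

# One value over everything: what happens when all files reach MLU as one input.
pooled = mlu5([u for utts in transcripts.values() for u in utts])

print(per_file)
print(pooled)
```

In the pooled case the five longest utterances are drawn from all children at once, so the result describes no individual child.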

--
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u...@googlegroups.com.
To post to this group, send email to chib...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/9c8fb5e5-0e6e-4a25-8adf-af540098e5b7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Elma Blom

Oct 10, 2016, 4:54:27 PM
to chib...@googlegroups.com
Dear Leonid,

your assumption is entirely correct, and the command works perfectly now. Thanks for your swift response and explanation.

Best wishes, Elma

bwv...@gmail.com

Feb 2, 2018, 1:10:14 PM
to chibolts
Hi Leonid

My question differs from Elma's but relates to the subject line so I am posting it here. 

I want to get MLU based on the longest five utterances (MLU5) for 33 separate files. I can do this following the tutorial in the manual for one file. But for my own analyses, I have multiple files that I want to get one output file for (containing the 33 results - not an average). Is there a way to do this?

This is what I thought would work but doesn't:
maxwd +g1 +c5 +d1 +tCHI +o%mor -s"[+ bch]" -s+\" @(I enter my files here by clicking File In) | mlu

The above command brings up error messages about %mor tiers not being associated with speakers - which I don't think is an issue since it runs fine when I do it for an individual file. 

So then I created a batch file, as outlined in the tutorial. But that brings up an error message saying "Can't open output file". Even if I was doing this correctly and it worked, I actually don't want 33 separate files for each output - I'd like them all in one output file (but not averaged). 

Any help would be appreciated, 

Thanks
V

bwv...@gmail.com

Feb 2, 2018, 2:12:22 PM
to chibolts
maxwd +tCHI +g1 +d1 +c5 -s"[+ bch]" -s+\"  @ | mlu > mlu5

This gets the output into one file, but it again collapses the data across all .cha files rather than reporting a result for each file.

Any help would be appreciated here. 

Leonid Spektor

Feb 2, 2018, 2:20:42 PM
to chib...@googlegroups.com
V,

The suggestions below have been tested on the latest version of CLAN only. I don't know what version you have, so these commands might not work with your version of CLAN.

First of all, since you are using the +g1 option with MAXWD, you need to make sure that your data files have a %mor tier. Otherwise, the morpheme count will be incorrect.

Second, if you want to get separate outputs for multiple files, then you should not use a pipe (|) to send output from MAXWD to MLU.

Here are the commands that I would recommend:

maxwd +g1 +c5 +d1 +t*CHI -s"[+ bch]" +f (your file names)

mlu *.mxwrd.cex > combined.cex

OR

mlu +d *.mxwrd.cex

The last command will give you a warning that @ID tier was not found, but that is okay.


Leonid.
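
The goal stated earlier in the thread, one output holding a separate result per file rather than an average, can be pictured with a short Python sketch. This is only an illustration of the target layout, not something CLAN produces in this form: the function name `results_table`, the column names, and the per-file values are all hypothetical.

```python
import csv
import io


def results_table(per_file_mlu):
    """Return tab-separated text with one row per transcript (no averaging)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["file", "mlu5"])
    for name in sorted(per_file_mlu):
        writer.writerow([name, f"{per_file_mlu[name]:.2f}"])
    return buffer.getvalue()


# Hypothetical values standing in for numbers read from the per-file MLU output.
example = {"child_b.cha": 2.0, "child_a.cha": 4.0}
print(results_table(example), end="")
```

Each transcript keeps its own row, so the 33 results stay distinct inside a single file.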

Victoria Ward

Feb 2, 2018, 5:01:05 PM
to chibolts
This has worked - thanks very much! 

Vicky