Switches in FLO (+u or >>, +c1) and making a list of speaker IDs


Amanda Owen Van Horne

Aug 14, 2013, 9:41:26 AM
to chib...@googlegroups.com
Hi,
  I'm cleaning codes out of some CHAT files to prepare them for use in another program - I really want one gigantic file full of adult input to children.

For the most part I've been relying on the FLO command to create a new file of just the adult input (flo @ -t*CHI -t% +d +r1 +f).

  I'd like to do a few things to improve workflow that have me stumped:

1) eliminate 'empty' utterances (*MOT: 0.).  FLO changes it to *MOT: .  <-- +c1 seems like it might do this in KWAL (9.17.3 in the manual) but +c doesn't do the same thing with every command and doesn't seem available with FLO. Do I need to do a first pass with KWAL and pipe the results into FLO?  Or vice versa?

2) Make a list of all speaker identifiers across a series of corpus files  (*MOT:, *FAT:, *URS:, etc)

3) Merge all of the output into a single file. FLO @ -t*CHI -t% +d >>myanalyses doesn't work, nor does +u. The first generates an empty file called myanalyses plus lots of individual files, while the second yields an error message. Again, should I be combining FLO with something else to get the output I want?

I've been experimenting with the BATES corpus for now but hope to use more if I can sort out the above issues.

Thanks so much for any guidance you can provide.

Amanda



Amanda J. Owen Van Horne PHD CCC-SLP
Department of Communication Sciences and Disorders
University of Iowa
ajo...@gmail.com



Brian MacWhinney

Aug 14, 2013, 10:06:15 AM
to chib...@googlegroups.com
On Aug 14, 2013, at 9:41 AM, Amanda Owen Van Horne <ajo...@gmail.com> wrote:

> Hi,
>   I'm cleaning codes out of some CHAT files to prepare them for use in another program - I really want one gigantic file full of adult input to children.
>
> For the most part I've been relying on the FLO command to create a new file of just the adult input (flo @ -t*CHI -t% +d +r1 +f).

I would use KWAL for this, rather than FLO.  Also, you can use TRIM, which simplifies some of the typing in the KWAL command.


>   I'd like to do a few things to improve workflow that have me stumped:
>
> 1) eliminate 'empty' utterances (*MOT: 0.).  FLO changes it to *MOT: .  <-- +c1 seems like it might do this in KWAL (9.17.3 in the manual) but +c doesn't do the same thing with every command and doesn't seem available with FLO. Do I need to do a first pass with KWAL and pipe the results into FLO?  Or vice versa?


For detailed or complex custom removals and changes of this type, I would recommend using a regular-expression editor such as BBEdit (Mac) or Sublime Text (PC).
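
For readers who would rather script this cleanup than do it in an editor, here is a minimal Python sketch of the same regular-expression idea. It is not a CLAN feature. It assumes that main tiers look like "*MOT:" followed by text, that an utterance counts as empty when its text is only "0" and/or a terminator, and that any dependent (%) or tab-indented continuation lines after a dropped utterance should go with it. The name drop_empty_utterances is made up for illustration.

```python
import re

# Assumption: a main tier is "*SPEAKER:" plus text; an "empty" utterance has
# nothing but an optional "0" and an optional terminator (. ? !) as its text.
EMPTY_UTTERANCE = re.compile(r"^\*[^:\s]+:\s*0?\s*[.?!]?\s*$")


def drop_empty_utterances(lines):
    """Return the lines without empty utterances and their attached lines."""
    kept = []
    skipping = False
    for line in lines:
        if line.startswith("*"):
            # A new main tier: skip it only if it is empty.
            skipping = bool(EMPTY_UTTERANCE.match(line))
        elif not line.startswith(("%", "\t")):
            # Headers (@...) and any other line end a skipped utterance.
            skipping = False
        if not skipping:
            kept.append(line)
    return kept


if __name__ == "__main__":
    sample = ["*MOT:\t0 .\n", "*MOT:\tlook at that .\n", "*FAT: .\n"]
    print("".join(drop_empty_utterances(sample)), end="")
```

Utterances such as "0 [=! laughs] ." are deliberately left alone here; whether those count as empty depends on the analysis.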

> 2) Make a list of all speaker identifiers across a series of corpus files  (*MOT:, *FAT:, *URS:, etc)

freq +s"\**" *.cha +re +y +u
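
Outside CLAN, a similar tally of speaker identifiers can be sketched in Python. This is only an illustration and is not how FREQ works internally: it assumes every main tier starts with "*", a speaker code, and a colon, and that the transcripts use the .cha extension. The function names count_speakers and count_speakers_in_tree are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: a main tier begins with "*", a speaker code, then ":".
SPEAKER = re.compile(r"^\*([^:\s]+):")


def count_speakers(lines):
    """Count how many main tiers each speaker code has in the given lines."""
    counts = Counter()
    for line in lines:
        match = SPEAKER.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_speakers_in_tree(root):
    """Add up speaker counts over every .cha file under root, recursively."""
    total = Counter()
    for path in sorted(Path(root).rglob("*.cha")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            total.update(count_speakers(handle))
    return total


if __name__ == "__main__":
    # "." is a placeholder for the corpus folder.
    for speaker, n in sorted(count_speakers_in_tree(".").items()):
        print(f"*{speaker}:\t{n}")
```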


> 3) Merge all of the output into a single file. FLO @ -t*CHI -t% +d >>myanalyses doesn't work, nor does +u. The first generates an empty file called myanalyses plus lots of individual files, while the second yields an error message. Again, should I be combining FLO with something else to get the output I want?

We designed KWAL, rather than FLO, for what you are doing, so KWAL would seem to be a better match. You could always run FLO at the very end, if needed.

--Brian MacWhinney

--
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u...@googlegroups.com.
To post to this group, send email to chib...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/CA%2BUfwo5HtQcYMOd68x0KD5Cz%3D1VoRm_br-82G70p8vzZ9dugjg%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Amanda Owen Van Horne

Aug 14, 2013, 10:21:51 AM
to chib...@googlegroups.com
Thanks so much. FLO did such a good job of cleaning out retracings and () in funny places that I was using it. I'll go back and start over with KWAL and then, as you suggest, run FLO at the end. Thanks too for the command to extract a list of all speakers.  I had tried several very wrong ways and never got close.  That's super helpful!

Amanda

Amanda Owen Van Horne

Aug 14, 2013, 10:29:48 AM
to chib...@googlegroups.com
So I just tried to run a similar analysis using KWAL, and the +c switch doesn't seem to work in KWAL either (I both typed in my own command and copied and pasted the example for extracting speech only from 9.17.2). Does +c not work anymore?

Amanda

Brian MacWhinney

Aug 14, 2013, 10:37:23 AM
to chib...@googlegroups.com
Dear Amanda,
    KWAL never had a +c switch. I don't believe that the +c switch for FLO was designed to remove empty utterances. As I suggested earlier, for such specialized removals, you would want to use something like BBEdit or Sublime Text. I would do this before a final run of FLO.

--Brian
 

Amanda Owen Van Horne

Aug 14, 2013, 10:39:52 AM
to chib...@googlegroups.com
Sorry for the confusion.  I was following the instructions on page 110 of the manual
"+c Select speaker tier only if has at least Nwords or 0 for no words at all" 
I'll experiment with another method.

Amanda

Amanda J. Owen Van Horne


Brian MacWhinney

Aug 14, 2013, 11:12:52 AM
to chib...@googlegroups.com
Dear Amanda,

    Thanks for spotting this mistake in the manual. I see now that we changed +c to +xCN several months ago and that this change was not yet reflected in the manual. (It is now.) However, if you just type "kwal", you will see this description of the +xCN switch and the related +xS switch:

+xCN: include only utterances which are C (>, <, =) than N items (w, c, m), "+x=0w" for zero words
+xS: specify items to include in above count (Example: +xxxx +xyyy)
-xS: specify items to exclude from above count

This is only for KWAL, not FLO.  So you would still want to use KWAL first and then FLO.
Please give this a try.  Hopefully, it can save you from relying on BBEdit in this case.

--Brian MacWhinney

Leonid Spektor

Aug 14, 2013, 11:48:42 AM
to chib...@googlegroups.com
Amanda,

The current version of FLO on our server does not have the +x option that KWAL has, but the +x option has been added to FLO, and a new CLAN will be released today. The redirect option ">>myanalyses" did not work for you because FLO sends output to a file by default. If you add the "-f" option to your command, "FLO @ -t*CHI -t% +d  -f >>myanalyses", then it will work as you expected.
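
If the per-file output files have already been written, they can also be joined afterwards. The following is a minimal Python sketch under that assumption, not a CLAN feature; the function name merge_files and the file names in the usage note are hypothetical.

```python
from pathlib import Path


def merge_files(paths, destination):
    """Concatenate text files into one file, in the order given."""
    destination = Path(destination)
    with destination.open("w", encoding="utf-8") as out:
        for path in paths:
            text = Path(path).read_text(encoding="utf-8", errors="replace")
            out.write(text)
            # Make sure the next file starts on a fresh line.
            if text and not text.endswith("\n"):
                out.write("\n")
    return destination
```

For example, merge_files(sorted(Path("output").glob("*.cex")), "myanalyses.txt") would join every matching file in a folder named output, where both the folder name and the extension are placeholders for whatever your run actually produced. The destination should sit outside the set of files being merged.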

By default, the +x option does not count 0, "xxx", "yyy", or "www" as words, so any utterance that consists only of those symbols is counted as having zero words. If you want to exclude only utterances that consist of "0." while keeping utterances that contain only "xxx", "yyy", or "www", then you will need to add the +x"xxx", +x"yyy", and +x"www" options to the KWAL command line. Your command will be:

kwal +d +x>0w +x"xxx" +x"yyy" +x"www"
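
To make the counting rule above concrete, here is a simplified Python model of it. This is an illustration of the logic as described in this message, not CLAN's actual tokenizer: it splits on whitespace, strips trailing terminators, and ignores CHAT codes entirely. The names count_words and keep_utterance are hypothetical.

```python
# Symbols that, per the description above, are not counted as words by default.
PLACEHOLDERS = {"0", "xxx", "yyy", "www"}


def count_words(utterance, count_as_words=()):
    """Count tokens, ignoring placeholders unless listed in count_as_words."""
    ignored = PLACEHOLDERS - set(count_as_words)
    count = 0
    for token in utterance.split():
        word = token.rstrip(".?!")  # drop a trailing terminator, if any
        if word and word not in ignored:
            count += 1
    return count


def keep_utterance(utterance, count_as_words=()):
    """Model of "more than zero words": keep utterances with at least one."""
    return count_words(utterance, count_as_words) > 0


if __name__ == "__main__":
    extra = ("xxx", "yyy", "www")
    for text in ["0 .", "xxx .", "look at that ."]:
        print(repr(text), keep_utterance(text), keep_utterance(text, extra))
```

In this model, "xxx ." is dropped by default but kept once "xxx" is listed in count_as_words, while "0 ." is dropped either way.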


Leonid.



Amanda Owen Van Horne

Aug 14, 2013, 12:03:55 PM
to chib...@googlegroups.com
That worked! Thanks! 

