combining output to excel

Brian Verdine

Feb 29, 2016, 3:17:59 PM
to chibolts
Hi,

I am trying to figure out if there is a simple way to run and compile data from multiple freq commands.  

Most of the transcripts I have contain only 2 speakers (a child and a parent).  However, about 1/3 also have a toy (an iPad) that talks and that we transcribed.  I am trying to count the types and tokens, and get TTR, for different groups of words for each of these 3 speaker tiers individually.  In addition, for those that heard the iPad, I need to produce type/token stats for the parent and iPad combined, so that we can look at all of the speech a child hears (whether from parent or iPad).  The 4 commands below generate this data into Excel files, (generally) in the format I need.

Child:
Freq +d5 +d2 +t@ID=”*|Target_Child|*” +...@CutFileAllCodes.cut +fCHI_AllCodes *.cha
Parent:
Freq +d5 +d2 +t@ID=”*|Target_Adult|*” +...@CutFileAllCodes.cut +fPAR_AllCodes *.cha
iPad:
Freq +d5 +d2 +t@ID=”*|Toy|*” +...@CutFileAllCodes.cut +fIPA_AllCodes *.cha
Parent and iPad:
Freq +d5 +d2 +t@ID=”*|Target_Adult|*” +t@ID=”*|Toy|*” +o3 +...@CutFileAllCodes.cut +fPARIPA_AllCodes *.cha

However, I have three problems I am hoping to solve.  I am repeating this set of 4 freq analyses for 7 groups of codes/words (which will replace the cut file referenced in +s).  So I will have at least 28 commands and 28 excel files I will need to combine into a single database for analysis.  I would love to automate this data generation and combination as much as possible.


1)  Every time I run these commands separately they work fine.  If I try to run them as a batch by using the attached batch file (command "batch BatchCommands.cex") I get an output that says "Using search file: C:\talkbank\clan\work\CutFileAllCodes.cut" was found, followed by "CAN'T FIND ANY DATA TIERS IN ANY OF INPUT FILES PLEASE PROVIDE A SPECIFIC SPEAKER WITH +t OPTION".  The batch command repeats this same warning 4 times and produces nothing.  The batch file is in the "work" directory, as are the "CutFileAllCodes.cut" file and all of the chat files, so I am not sure why it cannot find them.  It would be nice to set these up in a file and just run them all at once.  Plus it will make the stats more easily reproducible and better documented.


2)  Ultimately I would like one database with a single row for each transcript and the counts/stats generated from each of my commands in columns going across.  Is there a way to have CLAN automatically append data and match it into a single row for each file as each additional command is run?


3)  If there is nothing I can do for number 2, then it would be really nice if I could solve another problem.  Since the iPad codes are only in about 1/3 of the files, the Excel sheet produced by the iPad command doesn't contain a row for every analyzed file.  The file is therefore "shorter" than the rest, and when I paste this data in with the other data I have to match the iPad data for each participant by hand against the output from the rest of the files.  If I could have CLAN output 0's/periods/blanks for files that are analyzed but lack the tier, I could write a very simple copy/paste macro to combine the Excel files.  Is there a way to tell CLAN to produce output for every analyzed file even if the tier is missing?


Thanks for any help!

Brian



BatchCommands.cex

Leonid Spektor

Feb 29, 2016, 4:40:50 PM
to chib...@googlegroups.com
Brian.

    I can't answer your question 1). I created my own batch file and when I ran it I did not get any error messages.  I ran it on both Mac and Windows 8.1 PC. If you could email me directly some of the *.cha files you use as input and the CutFileAllCodes.cut file, then I will have a better chance of replicating the problem.

2).  When FREQ creates a spreadsheet it does not append data to existing output files, and it certainly can't add data to the end of each input file's existing row. Your output consists of only four Excel files, so it should be easy to append each consecutive FREQ file's output to the right of the previous one by hand in Excel. The rows of each FREQ command's output file should align automatically, since every input file will have exactly one corresponding row in the Excel file and every FREQ command will produce exactly the same number of rows. You said that some of your input CHAT files do not have a Toy/IPA speaker, but this can be fixed by creating an @ID header for the Toy speaker in all files and, in files that currently do not have any Toy transcription, creating one dummy tier for that speaker. For example, if the "Toy" speaker code is "IPA", then just add one dummy tier "*IPA:    0." to the files that do not have IPA speakers now. Otherwise, FREQ does not create output for files that do not have the speakers specified on the command line. The other problem I see is with this command:


Freq +d5 +d2 +t@ID=”*|Target_Adult|*” +t@ID=”*|Toy|*” +o3 +...@CutFileAllCodes.cut +fPARIPA_AllCodes *.cha

This command will produce two rows in Excel for every input file that has both "Target_Adult" and "Toy". If you want FREQ to combine the results of those two speakers into one row, then add the +o3 option to the FREQ command line.


If I misunderstood what you are trying to accomplish, then please send me a more specific description directly; some examples would help a lot too.
   
Leonid.


Brian Verdine

Mar 1, 2016, 9:33:55 AM
to chibolts
Hi Leonid,

My responses are in line below.

Thanks!
Brian


On Monday, February 29, 2016 at 4:40:50 PM UTC-5, Spektor, Leonid: CMU wrote:
> Brian.
>
>     I can't answer your question 1). I created my own batch file and when I ran it I did not get any error messages.  I ran it on both Mac and Windows 8.1 PC. If you could email me directly some of the *.cha files you use as input and the CutFileAllCodes.cut file, then I will have a better chance of replicating the problem.

I will send some immediately after posting.  As I said, the commands work when I plug them individually into the command window.  Seems weird that it could be either my chat files or the cut file, but anything is possible I suppose... I assumed maybe there was something wrong with the formatting of the batch file or the way I called it (I literally type in "batch BatchCommands.cex" into the command window).  Nonetheless, this is a problem I would definitely like to figure out.
 
> 2).  When FREQ creates a spreadsheet it does not append data to existing output files, and it certainly can't add data to the end of each input file's existing row. Your output consists of only four Excel files, so it should be easy to append each consecutive FREQ file's output to the right of the previous one by hand in Excel.

My output from the example commands I posted is only 4 spreadsheets, but I will be running these commands with at least 7 groups of words (so 28+ spreadsheets) and inevitably there will be other groups of words/codes we will want to export.  This still isn't a huge problem, but I wanted to make sure I was doing the easiest thing in case CLAN had a way to combine them automatically.
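
Since FREQ does not combine its outputs, the combining step could be scripted outside CLAN. The following is a minimal Python sketch, not something from the CLAN toolset. It assumes each FREQ spreadsheet has first been saved as a CSV file and that the column identifying the transcript is named "File"; that column name, the function names, and the example file names in the closing comment are all hypothetical. Transcripts missing from one table (for example, files with no iPad tier) are filled with "0" so every transcript ends up on a single row.

```python
import csv


def load_table(path, key):
    """Read one exported table; return (data column names, {transcript: row})."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        columns = [c for c in reader.fieldnames if c != key]
        rows = {row[key]: row for row in reader}
    return columns, rows


def merge_tables(sources, key="File", fill="0"):
    """Merge several per-transcript tables into one row per transcript.

    sources maps a column prefix (e.g. "CHI") to a CSV path.
    key is the assumed name of the transcript column.
    Transcripts absent from a table get `fill` in that table's columns.
    """
    loaded = {prefix: load_table(path, key) for prefix, path in sources.items()}
    # Every transcript that appears in at least one table, in sorted order.
    names = sorted({name for _, rows in loaded.values() for name in rows})
    header = [key] + [
        f"{prefix}_{col}" for prefix, (cols, _) in loaded.items() for col in cols
    ]
    merged = []
    for name in names:
        line = [name]
        for cols, rows in loaded.values():
            row = rows.get(name)
            line += [row[c] if row else fill for c in cols]
        merged.append(line)
    return header, merged


def write_table(path, header, rows):
    """Write the merged table back out as CSV."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)


# Example call (hypothetical file names):
# header, rows = merge_tables({"CHI": "CHI_AllCodes.csv", "IPA": "IPA_AllCodes.csv"})
# write_table("combined.csv", header, rows)
```

Matching on the transcript name rather than on row position means the tables do not need to have the same number of rows, which is the situation described for the iPad output.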
 
> The rows of each FREQ command's output file should align automatically, since every input file will have exactly one corresponding row in the Excel file and every FREQ command will produce exactly the same number of rows. You said that some of your input CHAT files do not have a Toy/IPA speaker, but this can be fixed by creating an @ID header for the Toy speaker in all files and, in files that currently do not have any Toy transcription, creating one dummy tier for that speaker. For example, if the "Toy" speaker code is "IPA", then just add one dummy tier "*IPA:    0." to the files that do not have IPA speakers now.  Otherwise, FREQ does not create output for files that do not have the speakers specified on the command line.

Adding an IPA line should drastically help with combining them and, in fact, should make it really easy to write a copy/paste macro for the Excel files that will do it in a few seconds.  Since I created the chat files from Excel files to begin with, they actually all already have the @ID header for Toy.  I just need to add the one dummy line.  Thanks for the idea.  I think this will take care of my biggest problem.
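
Adding the dummy line to many files by hand is tedious, so it could also be scripted. Below is a minimal Python sketch under some assumptions: the speaker code is "IPA" (as in Leonid's example), main tiers begin with "*IPA:" followed by a tab, and each transcript ends with an "@End" line before which the dummy tier can be placed. The function names are hypothetical, and it would be prudent to run this on a copy of the transcripts.

```python
from pathlib import Path


def add_dummy_tier(text, speaker="IPA"):
    """Return CHAT text with one dummy tier for `speaker` if it has none."""
    lines = text.splitlines(keepends=True)
    tier = f"*{speaker}:"
    if any(line.startswith(tier) for line in lines):
        return text  # speaker already has at least one tier; leave unchanged
    dummy = f"{tier}\t0.\n"  # dummy content "0." as suggested in the thread
    for i, line in enumerate(lines):
        if line.startswith("@End"):
            lines.insert(i, dummy)  # place the dummy tier just before @End
            break
    else:
        # No @End line found: append the tier at the end of the file.
        if lines and not lines[-1].endswith("\n"):
            lines[-1] += "\n"
        lines.append(dummy)
    return "".join(lines)


def patch_folder(folder, speaker="IPA"):
    """Add the dummy tier to every *.cha file in `folder` that lacks one.

    Returns the names of the files that were changed.
    """
    changed = []
    for path in sorted(Path(folder).glob("*.cha")):
        original = path.read_text(encoding="utf-8")
        patched = add_dummy_tier(original, speaker)
        if patched != original:
            path.write_text(patched, encoding="utf-8")
            changed.append(path.name)
    return changed
```

Running the function twice on the same file is harmless, because a file that already contains a tier for the speaker is returned unchanged.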
 
> The other problem I see is with this command:
>
> Freq +d5 +d2 +t@ID=”*|Target_Adult|*” +t@ID=”*|Toy|*” +o3 +...@CutFileAllCodes.cut +fPARIPA_AllCodes *.cha
>
> This command will produce two rows in Excel for every input file that has both "Target_Adult" and "Toy". If you want FREQ to combine the results of those two speakers into one row, then add the +o3 option to the FREQ command line.

Maybe I'm misunderstanding, but the +o3 is after "Toy" in the command and seems to be combining just fine when I copy/paste the command by itself into the command window.
 

> If I misunderstood what you are trying to accomplish, then please send me a more specific description directly; some examples would help a lot too.

I think you mostly interpreted correctly. If I can get the batch running issue fixed I think I'll be back in business.