how to use the cut file to include and exclude words with freq


Janet Bang

Dec 13, 2018, 7:40:37 PM
to chib...@googlegroups.com

Hello,


We have questions about how to use cut files with freq. 


We have a list of words we want to exclude from our freq count, and other times include in our freq count. 


When we exclude, we use the command below. With it we would like a count of lemmas on the %mor line, excluding utterances with the English pre-code and excluding a list of additional English words in our cut file. This command appears to work fine on our test files.


freq +f +u +o3 +sm;*,o% -s"[- eng]" -...@english.cut @


english.cut file set up:

co|please

co|thank_you



However, in another command we would like a freq count of lemmas for only those utterances with [- eng] AND for the same words in the cut file. (This would give us a count of all English words in the file: those in English-only utterances plus single English words elsewhere.) We've tried the following command, but it only provides lemmas on the [- eng] lines and does not include the words in the cut file.


freq +f +u +o3 +sm;*,o% +s"[- eng]" +s@english.cut @


Does the command or cut file need to be written differently if we now want to include those words in our freq count? We know that another option would be to tag all the words in the cut file with an @s and use the command below, but we were trying out other possibilities.


freq +u +o3 +f +l +s*@s:eng @


Thank you in advance,

Janet


--

Janet Y. Bang, Ph.D.

Postdoctoral Fellow

Department of Psychology

Stanford University


jb...@stanford.edu





Leonid Spektor

Dec 14, 2018, 12:02:34 AM
to chib...@googlegroups.com
Janet,

The problem with your '+s"[- eng]" +s...@english.cut' command is that the +s"[- eng]" option tells FREQ to look only at utterances that have the "[- eng]" pre-code. So if the words from the "english.cut" file do not occur in utterances with the "[- eng]" pre-code, those words will not be found.
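
To make the point above concrete, here is a small Python sketch of the difference between restricting the search to pre-coded utterances and the count that was actually wanted. It is only a toy model under stated assumptions, not how FREQ is implemented: the `Utterance` type, the sample data, and both function names are invented for illustration, and it does not model how FREQ applies a word list inside the selected utterances.

```python
from collections import Counter
from typing import List, NamedTuple, Optional, Set


class Utterance(NamedTuple):
    precode: Optional[str]  # "eng" stands in for an utterance marked [- eng]
    lemmas: List[str]       # toy stand-ins for items on the %mor tier


# Invented sample data, not taken from any real transcript.
UTTERANCES = [
    Utterance("eng", ["v|want", "n|cookie"]),
    Utterance(None, ["co|please", "v|dar"]),
    Utterance(None, ["n|agua"]),
]
CUT_WORDS = {"co|please", "co|thank_you"}  # mirrors the english.cut entries


def restricted_count(utterances: List[Utterance], precode: str) -> Counter:
    """Count lemmas only in utterances carrying the pre-code.

    Words that occur in other utterances are never seen, whatever
    word list is supplied alongside the utterance filter.
    """
    counts: Counter = Counter()
    for utt in utterances:
        if utt.precode == precode:
            counts.update(utt.lemmas)
    return counts


def desired_count(utterances: List[Utterance], precode: str,
                  words: Set[str]) -> Counter:
    """Count every lemma in pre-coded utterances, plus listed words elsewhere."""
    counts: Counter = Counter()
    for utt in utterances:
        if utt.precode == precode:
            counts.update(utt.lemmas)
        else:
            counts.update(lemma for lemma in utt.lemmas if lemma in words)
    return counts


if __name__ == "__main__":
    print(restricted_count(UTTERANCES, "eng"))
    print(desired_count(UTTERANCES, "eng", CUT_WORDS))
```

In this model `co|please` sits in an utterance without the pre-code, so the restricted count never reaches it, while the desired count picks it up.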

The command "freq +u +o3 +f +l +s*@s:eng" is the right choice. If you want lemmas, then you need to add "+sm;*,o% +d7" options. Try command:

freq +u +o3 +f +l +sm;*,o% +s*@s:eng +d7

If this doesn't work, then please email me a sample file of your data directly so that I can see what FREQ has to work with.
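
The @s route depends on the English words being tagged on the main tier first. A minimal Python sketch of that preprocessing step follows. It is not part of CLAN: the `tag_line` helper is hypothetical, and the word list assumes the main-tier forms of the english.cut entries are `please` and `thank_you`. It handles only single-line main tiers with space-separated words and leaves dependent tiers such as %mor untouched.

```python
# Assumed main-tier forms of the english.cut entries (co|please, co|thank_you).
ENGLISH_WORDS = {"please", "thank_you"}
TAG = "@s:eng"


def tag_line(line: str, words=ENGLISH_WORDS, tag: str = TAG) -> str:
    """Append the tag to listed words on a main-tier line.

    Lines that do not start with '*' (headers, dependent tiers) are
    returned unchanged, as are words that already carry a tag.
    """
    if not line.startswith("*"):
        return line
    speaker, sep, rest = line.partition("\t")
    if not sep:
        return line  # no tab after the speaker code; leave the line alone
    tagged = [tok + tag if tok.lower() in words else tok
              for tok in rest.split(" ")]
    return speaker + sep + " ".join(tagged)


if __name__ == "__main__":
    print(tag_line("*CHI:\tquiero agua please ."))
```

Running this on a copy of the transcripts and checking the result by hand before using it with FREQ would be the safer workflow.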


Leonid.


Janet Bang

Dec 14, 2018, 3:51:56 PM
to chib...@googlegroups.com
Hi Leonid,

Thank you for the clarification and the additional code for the lemmas! That should work for us. 

Janet


