Stephanie,
If you want to analyze data from our server, we have many data sets that have already been tagged with the MOR grammar. Our data is located on one of two servers at these URLs:
If you look at the "http://childes.talkbank.org/data/" web page, you will see data names containing the string "-MOR". This data has MOR tags. You just need to download it and run FREQ commands to compute the frequency of nouns and verbs. I will give examples of the FREQ commands later. First you need to decide exactly which words you consider to be nouns and verbs. To give you a better explanation, I recommend that you download the grammar for English, or for the language of your choice, from our server at this URL:
I will use English data as an example, because you did not specify which language you are interested in. After you download the MOR grammar from the web link above, unzip it. For the English grammar you will get an "eng" folder; move it to your hard disk, preferably into the "CLAN" folder. On a Mac it goes into the "/Applications/CLAN" folder, and on a Windows PC it goes into the "c:\TalkBank\CLAN" folder. If you installed CLAN in a custom location, then you know where CLAN is located on your computer.
Now open the folder "eng/lex". Here you will see files that group words by part of speech. You can see that there are a number of files with "n-" and "v-" in their names. This is where deciding which words are nouns and which are verbs comes in. For example, absolutely all verbs and all nouns can be found with this FREQ command:
freq +s"@|-n,|n:*,|-v,|-cop,|-aux,|-mod,|-mod:*,|-part" *.cha
This search includes nouns, pronouns, verbs, auxiliaries, participles, and other variations of nouns and verbs. You can open each file in the "eng/lex" folder to see a list of all the words of each part of speech.
If you do not want to include pronouns or other variations of nouns in your count, use this command:
freq +s"@|-n,|-v,|-cop,|-aux,|-mod,|-mod:*,|-part" *.cha
In the purest form, nouns and verbs are counted with this command:
freq +s"@|-n,|-v" *.cha
But if you want to count auxiliaries and participles along with basic verbs, use this command:
freq +s"@|-n,|-v,|-aux,|-part" *.cha
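To make concrete what these +s patterns select, here is a rough Python sketch of the same idea. It is not part of CLAN and not how FREQ is implemented. It only assumes that each item on the %mor tier looks like pos|stem-suffix, that clitics are joined with "~", and that a pattern such as n:* should match n:prop and similar subtypes. The function names and the sample utterances are made up for illustration.

```python
from collections import Counter
from fnmatch import fnmatchcase


def mor_items(line):
    """Yield (pos, rest) pairs from one %mor tier line (hypothetical helper)."""
    if not line.startswith("%mor:"):
        return
    for token in line[len("%mor:"):].split():
        # Assumption: clitics are joined with "~", e.g. pro|it~cop|be&3S.
        for part in token.split("~"):
            if "|" not in part:
                continue  # punctuation such as "." or "?"
            pos, rest = part.split("|", 1)
            pos = pos.split("#")[-1]  # assumption: prefixes are written as "un#"
            yield pos, rest


def count_forms(lines, pos_patterns):
    """Count whole %mor forms whose part of speech matches any pattern."""
    counts = Counter()
    for line in lines:
        for pos, rest in mor_items(line):
            if any(fnmatchcase(pos, pattern) for pattern in pos_patterns):
                counts[f"{pos}|{rest}"] += 1
    return counts


# Made-up sample lines in CHAT style, not taken from any corpus.
sample = [
    "*CHI:\tthe dog is eating cookies .",
    "%mor:\tdet:art|the n|dog aux|be&3S part|eat-PRESP n|cookie-PL .",
    "*CHI:\tMommy wants it .",
    "%mor:\tn:prop|Mommy v|want-3S pro:per|it .",
]

broad = count_forms(sample, ["n", "n:*", "v", "cop", "aux", "mod", "mod:*", "part"])
pure = count_forms(sample, ["n", "v"])

for form, n in sorted(broad.items()):
    print(n, form)
```

Real CHAT files also wrap long tiers onto continuation lines, which this sketch ignores, so use FREQ itself for actual counts.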
As you can see, you can fine-tune your search to your particular specifications. All of the FREQ commands above output the full form of each word. If you want only the count for each part of speech, replace the four commands above with the following four:
freq +s"@|-n,|n:*,|-v,|-cop,|-aux,|-mod,|-mod:*,|-part,o-%" *.cha
freq +s"@|-n,|-v,|-cop,|-aux,|-mod,|-mod:*,|-part,o-%" *.cha
freq +s"@|-n,|-v,o-%" *.cha
freq +s"@|-n,|-v,|-aux,|-part,o-%" *.cha
There are simpler ways to look for verbs and nouns with the FREQ command, but the more complex search patterns in the commands above give you the most precision. If you want to see the meaning of all those "|-" and "o-" symbols, just type the command "freq +s@", or for even more explanation look in the CLAN manual.
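As a rough illustration of what the count-only output amounts to, here is a small Python sketch that tallies parts of speech instead of word forms. It is only a sketch under the assumption that %mor items look like pos|stem-suffix and that clitics are joined with "~". It is not CLAN code, and the function name and sample lines are invented.

```python
from collections import Counter


def pos_counts(lines, wanted):
    """Tally how often each wanted part of speech occurs on %mor tiers."""
    counts = Counter()
    for line in lines:
        if not line.startswith("%mor:"):
            continue
        for token in line[len("%mor:"):].split():
            for part in token.split("~"):  # assumption: "~" joins clitics
                pos, sep, _ = part.partition("|")
                if not sep:
                    continue  # punctuation has no "|"
                pos = pos.split("#")[-1]  # assumption: prefixes end in "#"
                if pos in wanted:
                    counts[pos] += 1
    return counts


# Made-up sample lines in CHAT style.
sample = [
    "*CHI:\tthe dog is eating cookies .",
    "%mor:\tdet:art|the n|dog aux|be&3S part|eat-PRESP n|cookie-PL .",
    "*CHI:\tMommy wants it .",
    "%mor:\tn:prop|Mommy v|want-3S pro:per|it .",
]

counts = pos_counts(sample, {"n", "v", "aux", "part"})
for pos, n in sorted(counts.items()):
    print(pos, n)
```

Here only the exact tags listed in the set are counted, so n:prop is left out, in the same spirit as dropping the n:* pattern from the search.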
If you want to analyze your own data, or data that doesn't have MOR tags, then after you download and unzip the MOR grammar you need to set the "mor lib" directory to the location on your hard drive where you placed the grammar folder. In my example above that is the "/Applications/CLAN/eng" folder on a Mac and the "c:\TalkBank\CLAN\eng" folder on a PC. In CLAN's "Commands" window, click the "mor lib" button, navigate to the location of the language grammar on your computer's hard drive, and select that folder. Now you need to run the following two commands:
mor +1 *.cha
post +1 *.cha
This will add a "%mor" tier to all your data files, and you will be ready to run your analyses. If you have any CLAN questions, please post them to the chib...@googlegroups.com address and someone will be able to help you.
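One more practical note on the MOR step: before running it on a folder, it can help to check which files still lack a %mor tier. Here is a minimal Python sketch of such a check. It is not a CLAN feature; the function name is made up, and it simply assumes that a tagged file contains at least one line starting with "%mor:".

```python
import tempfile
from pathlib import Path


def files_without_mor(folder):
    """Return the names of .cha files in folder that have no %mor tier line."""
    missing = []
    for path in sorted(Path(folder).glob("*.cha")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if not any(line.startswith("%mor:") for line in text.splitlines()):
            missing.append(path.name)
    return missing


# Demonstration on two tiny made-up files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "tagged.cha").write_text(
        "*CHI:\tcookie .\n%mor:\tn|cookie .\n", encoding="utf-8"
    )
    Path(tmp, "untagged.cha").write_text("*CHI:\tcookie .\n", encoding="utf-8")
    todo = files_without_mor(tmp)
    print(todo)
```

Files that show up in the list are the ones that still need the two commands above.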