Question about marking clauses


Nicole Tracy-Ventura

Apr 14, 2015, 11:02:25 AM
to chibolts
Dear all,

I'm interested in marking clauses in my transcripts with the [^c] marker. I also want to mark whether the clauses contain errors so that I can count the number of clauses with errors and those without. Would the best way to do this be to just add [*] after the [^c] if there is an error anywhere in the clause? 

I tried doing this and ran a simple freq command adding +s"[^c]^[*]" but it said there were none. Adding +s"[^c]" by itself worked fine to get the total number of clauses. 

Thank you in advance for any help!

Nicole


Nicole Tracy-Ventura
University of South Florida

Brian MacWhinney

Apr 14, 2015, 2:04:59 PM
to chib...@googlegroups.com
Dear Nicole,

Great question.  I think the best way to do this is to use a variant of [^c] in the cases in which the clause contains an error.  Let’s say that this new code is [^d].  Then you can simply count both [^c] and [^d] using FREQ, as you mention below.  The only catch here is that we had to slightly modify the depfile.cut to allow this.  So, if you get a new version of CLAN with this new depfile.cut, it will allow [^d] (or any other code of this form) along with [^c].   I also modified the relevant section of the CLAN manual to read as follows.

—Brian MacW

Clause Delimiter                             [^c] 

If you wish to conduct analyses such as MLU and MLT based on clauses rather than utterances as the basic unit of analysis, you should mark the end of each clause with this symbol. It is not necessary to mark the scope of this symbol, since it is assumed to apply to all the material before it up to the beginning of the utterance or up to the preceding [^c] marker. It is possible to create additional user-defined single-letter codes using this format, such as [^d], which could be defined as a marker of a clause that includes an error. Then, inside the MLU and MLT programs, you need to add the +c switch to specify exactly which codes of this type should be recognized.



Nicole Tracy-Ventura

Apr 14, 2015, 6:56:25 PM
to Leonid Spektor, Brian MacWhinney, chibolts
Dear both,

Thank you so much for the quick response and for adjusting CLAN to make this analysis possible. I really appreciate it!

Best,
Nicole



On Tue, Apr 14, 2015 at 3:10 PM, Leonid Spektor <spe...@andrew.cmu.edu> wrote:
Nicole,

We gave this idea more consideration and decided to handle clauses differently, using our existing conventions. You can still use the symbol [^c] for clauses without errors. For clauses with errors, you can use a generic code, [^c *], or more specific codes such as [^c error type 1], [^c error type 2], and so on: [^c error ...]. This convention makes it easier for FREQ to search for all clauses with this command: freq +s"[^c*]". To count only clauses without errors, use this command: freq +s"[^c]". To count clauses with a specific error type, use this command: freq +s"[^c error type 1]".


Leonid.
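
For anyone who wants to cross-check the FREQ counts outside CLAN, here is a minimal Python sketch. It is not part of CLAN and does not reproduce FREQ's own matching rules. It assumes the convention described above ([^c] for error-free clauses, [^c *] or [^c some label] for clauses with errors) and, as a simplification, only reads main-tier lines that start with "*", so wrapped continuation lines are skipped. The function name count_clauses and the sample utterances are made up for illustration.

```python
import re
from collections import Counter

# Matches [^c] as well as [^c ...]; the optional group captures the error label.
CLAUSE_CODE = re.compile(r"\[\^c(?:\s+([^\]]*))?\]")


def count_clauses(lines):
    """Count clause codes on main speaker tiers (lines starting with '*')."""
    counts = Counter()
    for line in lines:
        if not line.startswith("*"):
            continue  # skip headers (@...) and dependent tiers (%...)
        for match in CLAUSE_CODE.finditer(line):
            label = match.group(1)
            if label is None:
                counts["no_error"] += 1
            else:
                counts["error"] += 1
                counts["error:" + label.strip()] += 1
    counts["total"] = counts["no_error"] + counts["error"]
    return counts


# Hypothetical transcript lines, invented for this example.
sample = [
    "@Begin",
    "*PAR:\tI goed to the store [^c *] and I bought milk [^c] .",
    "*PAR:\tshe have two cat [^c error type 1] .",
    "%com:\tthis tier is ignored [^c]",
]

result = count_clauses(sample)
for key in sorted(result):
    print(key, result[key])
```

Comparing these totals with the output of the three freq commands is a quick sanity check that the codes were typed consistently.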
