Working with codes
bart...@zas.gwz-berlin.de  
From: bart...@zas.gwz-berlin.de
Date: Thu, 19 Jun 2008 11:43:29 GMT
Local: Thurs, Jun 19 2008 7:43 am
Subject: Working with codes
Dear all,

For a study on anaphora, we are coding referring expressions in
children's narratives and I have some questions concerning the coding
line (we use %cod), as well as subsequent CLAN analyses.

First, is such a %cod line legal?
*CHI:   der hund beisst sie in den schwanz.
%cod:   der hund|S-DA:N-BL-V1:3-AS-DIR-hundC sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC

We use the minus symbol - for separating 7 levels of coding of each
referring expression, e.g., syntactic position, lexical realisation,
in/animacy, referent introduction vs. reference maintenance, etc. The
symbol : is used for separating sub-levels within each of the 7
superordinated levels.
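
To make the layout described above concrete, here is a minimal Python sketch (outside CLAN, not a CLAN feature) that splits a %cod line of this shape into referring expressions, levels, and sub-levels. The function name parse_cod_line is hypothetical, and the sketch assumes that "|" separates the expression from its code, "-" separates levels, and ":" separates sub-levels, exactly as in the example.

```python
from typing import List, Tuple


def parse_cod_line(cod: str) -> List[Tuple[str, List[List[str]]]]:
    """Split a %cod line into (referring expression, levels) pairs.

    Each level is a list of sub-levels, e.g. "DA:N" becomes ["DA", "N"].
    Assumes the layout from the post: expression|level-level-...-level.
    """
    items = []
    pending = []  # words of a multi-word expression seen before the "|"
    for token in cod.split():
        if "|" not in token:
            pending.append(token)
            continue
        last_word, code = token.split("|", 1)
        expression = " ".join(pending + [last_word])
        levels = [level.split(":") for level in code.split("-")]
        items.append((expression, levels))
        pending = []
    return items


cod = ("der hund|S-DA:N-BL-V1:3-AS-DIR-hundC "
       "sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC "
       "den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC")

for expression, levels in parse_cod_line(cod):
    print(expression, len(levels), levels)
```

Run on the example, this reports seven levels for 'der hund' and 'sie' but only six for 'den schwanz', so a check of this kind may also help to spot coding strings in which a level is missing.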

Secondly, is there any possibility to link each referring expression
on the *CHI line with its coding on the %cod line? Provisionally, we
opted for typing the referring expression before the coding string,
e.g., 'die katze'.

Thirdly and most importantly, we want to conduct analyses concerning
the cooccurrence of elements within each coding string. For instance,
we want to investigate differences in children's realisation of
referents as a function of referent introduction vs. anaphorical
expression (reference maintenance). For that, we want to find a range
of cooccurrences as the following:

DA:N and NV and IND
(where DA:N means definite article + noun, NV means referent
introduction, and IND means indirect anaphor)

I have tried COMBO, but either I don't understand the principle behind
the syntax of the command line or I am missing some important switch
or, well, I don't know what.

Two things are very important for us in such searches:
 - The search must be limited to each of the coding strings and not be
based on the whole %cod line. For instance, when looking for the
cooccurrence of DA:N and DIS, CLAN should not find it in the example
above, since it doesn't occur in any of the 3 coding strings. That is,
for this concrete example, how can we ensure that CLAN ignores the
cooccurrence of DA:N for 'der hund' and DIS for 'sie'?
 - How can we get quantitative results from such searches? In addition
to the concrete hits shown in the output window, it is very important
for us to have the number of cooccurrences found in each CHAT file, as
well as the total across all CHAT files searched.
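
The two requirements above can be sketched in Python, independently of CLAN. This is only an illustration under stated assumptions: the file names are made up, the %cod lines are supplied as plain strings rather than read from CHAT files, and a searched code such as DA:N counts only when it is a complete level of a single coding string.

```python
from collections import Counter
from typing import Dict, Iterable, List


def coding_strings(cod_line: str) -> List[str]:
    """Return the code part (after "|") of each coded item on a %cod line."""
    return [tok.split("|", 1)[1] for tok in cod_line.split() if "|" in tok]


def cooccurs(code: str, wanted: Iterable[str]) -> bool:
    """True if every wanted code is a complete level of this ONE coding string."""
    return set(wanted) <= set(code.split("-"))


def count_cooccurrences(files: Dict[str, List[str]],
                        wanted: List[str]) -> Counter:
    """Count matching coding strings per file (keys are file names)."""
    counts = Counter()
    for name, cod_lines in files.items():
        counts[name] = sum(
            cooccurs(code, wanted)
            for line in cod_lines
            for code in coding_strings(line)
        )
    return counts


# Hypothetical file names; each maps to the %cod lines found in that file.
example = ("der hund|S-DA:N-BL-V1:3-AS-DIR-hundC "
           "sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC "
           "den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC")
files = {"child01.cha": [example], "child02.cha": [example, example]}

per_file = count_cooccurrences(files, ["DA:N", "NV", "IND"])
print(dict(per_file), "total:", sum(per_file.values()))
print(sum(count_cooccurrences(files, ["DA:N", "DIS"]).values()))
```

With this data, DA:N + NV + IND is counted once per copy of the example line (only 'den schwanz' qualifies), while DA:N + DIS is never counted, because the two codes belong to different coding strings.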

I apologize if the answers to my questions are obvious or easy to
find in the CLAN manual. I have read the manual very carefully
before sending this query, but I don't seem to be able to find the
needed answers therein.

Many, many thanks in advance for any hint.

Kind regards,
Susanna

*****************************************************************
Susanna Bartsch
https://www.zas.gwz-berlin.de/mitarb/homepage/bartsch/
bart...@zas.gwz-berlin.de
Zentrum fuer Allgemeine Sprachwissenschaft (ZAS)
Centre for General Linguistics
Schuetzenstr. 18
10117 Berlin
Germany
Tel. +49 (0)30 20 192 503
Fax  +49 (0)30 20 192 402
*****************************************************************


Brian MacWhinney  
From: Brian MacWhinney <m...@cmu.edu>
Date: Fri, 20 Jun 2008 17:58:35 -0400
Local: Fri, Jun 20 2008 5:58 pm
Subject: Re: Working with codes
Dear Susanna,

      Sorry about the delay in replying.  I have been traveling.   Let  
me try to answer some of these questions below.

--Brian MacWhinney

On Jun 19, 2008, at 7:43 AM, bart...@zas.gwz-berlin.de wrote:

> Dear all,

> For a study on anaphora, we are coding referring expressions in
> children's narratives and I have some questions concerning the coding
> line (we use %cod), as well as subsequent CLAN analyses.

> First, is such a %cod line legal?
> *CHI:      der hund beisst sie in den schwanz.
> %cod:      der hund|S-DA:N-BL-V1:3-AS-DIR-hundC sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC

For dependent tier lines like %cod, pretty much everything is legal,
since the programs don't presume any particular structure on this
line.  For these lines, the main issue is a practical one: composing
the +s switch when you need to search.  Just make sure that you can
find the things you want to find by testing out some FREQ or KWAL
commands in advance.

> We use the minus symbol - for separating 7 levels of coding of each
> referring expression, e.g., syntactic position, lexical realisation,
> in/animacy, referent introduction vs. reference maintenance, etc. The
> symbol : is used for separating sub-levels within each of the 7
> superordinated levels.

This is fine.  You will have to use search strings like
+s"*-*-*-*-BL-*" and such.  Personally, I would find this confusing
and prone to error, but if you are good at asterisk counting, this
will work.

> Secondly, is there any possibility to link each referring expression
> on the *CHI line with its coding on the %cod line? Provisionally, we
> opted for typing the referring expression before the coding string,
> e.g., 'die katze'.

Ah, herein lies the rub (somewhere in Shakespeare).  You are basically  
trying to construct something like the %mor line with its 1-to-1 match  
to the main line.  This is a great idea.  However, the CLAN software  
is not yet really ready for this.  We are currently right in the  
middle of implementing strict 1-to-1 matching between the %mor and the  
main tier within the XML version of CLAN.  Once this is finished then  
"match" searches will work with the %mor line.  At that point, it  
would be relatively easy to extend this to a tier called %mat for a  
user-defined matching tier.  However, none of this will be ready until  
later this year.

> Thirdly and most importantly, we want to conduct analyses concerning
> the cooccurrence of elements within each coding string. For instance,
> we want to investigate differences in children's realisation of
> referents as a function of referent introduction vs. anaphorical
> expression (reference maintenance). For that, we want to find a range
> of cooccurrences as the following:

> DA:N and NV and IND
> (where DA:N means definite article + noun, NV means referent
> introduction, and IND means indirect anaphor)

I am not sure what you mean by "range" in your phrase "a range of  
cooccurrences".  However, finding *-DA:N-*^*^*-NV-* should be possible.

> I have tried COMBO, but either I don't understand the principle behind
> the syntax of the command line or I am missing some important switch
> or, well, I don't know what.

You probably just have to play around to learn how to use COMBO.

> Two things are very important for us in such searches:
> - The search must be limited to each of the coding strings and not be
> based on the whole %cod line. For instance, when looking for the
> cooccurrence of DA:N and DIS, CLAN should not find it in the example
> above, since it doesn't occur in any of the 3 coding strings. That is,
> for this concrete example, how can we ensure that CLAN ignores the
> cooccurrence of DA:N for 'der hund' and DIS for 'sie'?

That should be easy enough.  In COMBO lines, it is the ^ that searches  
across word boundaries.  Just make sure that your search strings don't  
include the ^.  So, you want
*-DA:N-*-DIS-*
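
The difference between matching within one word and matching across the whole line can be illustrated outside CLAN with a small Python sketch. It uses Python's fnmatch module purely as a stand-in for wildcard matching; it is not CLAN's matcher, and the two helper function names are made up for this illustration.

```python
from fnmatch import fnmatchcase

cod_line = ("der hund|S-DA:N-BL-V1:3-AS-DIR-hundC "
            "sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC "
            "den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC")
pattern = "*-DA:N-*-DIS-*"


def matches_any_word(line: str, pat: str) -> bool:
    """Apply the pattern to each space-separated word on its own."""
    return any(fnmatchcase(word, pat) for word in line.split())


def matches_whole_line(line: str, pat: str) -> bool:
    """Apply the pattern to the entire line, spaces included."""
    return fnmatchcase(line, pat)


print(matches_any_word(cod_line, pattern))    # False
print(matches_whole_line(cod_line, pattern))  # True
```

Applied word by word, the pattern finds nothing, because no single coding string contains both DA:N and DIS. Applied to the whole line it does match, pairing DA:N from 'der hund' with DIS from 'sie', which is exactly the hit to be avoided.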

> - How can we get quantitative results from such searches? In addition
> to the concrete hits shown in the output window, it is very important
> for us to have the number of cooccurrences found in each CHAT file, as
> well as the total across all CHAT files searched.

> I apologize if the answers to my questions are obvious or easy to
> find in the CLAN manual. I have read the manual very carefully
> before sending this query, but I don't seem to be able to find the
> needed answers therein.

I don't think you can really learn this stuff by reading the manual.
You just have to devote an hour or two to playing around with COMBO.
Think of it as a Bach theme with variations.

--Brian MacWhinney

