Dear Susanna,
Sorry about the delay in replying. I have been traveling. Let
me try to answer some of these questions below.
--Brian MacWhinney
On Jun 19, 2008, at 7:43 AM, bart...@zas.gwz-berlin.de wrote:
> Dear all,
> For a study on anaphora, we are coding referring expressions in
> children's narratives and I have some questions concerning the coding
> line (we use %cod), as well as subsequent CLAN analyses.
> First, is such a %cod line legal?
> *CHI: der hund beisst sie in den schwanz.
> %cod: der hund|S-DA:N-BL-V1:3-AS-DIR-hundC sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC
> den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC
For dependent tier lines like %cod, pretty much everything is legal,
since the programs don't presume any particular structure on this
line. For these lines, the main issue is a practical one: composing
the +s switch when you need to search. Just make sure that you can
find the things you want to find by testing out some FREQ or KWAL
commands in advance.
> We use the minus symbol - for separating 7 levels of coding of each
> referring expression, e.g., syntactic position, lexical realisation,
> in/animacy, referent introduction vs. reference maintenance, etc. The
> symbol : is used for separating sub-levels within each of the 7
> superordinated levels.
This is fine. You will have to use search strings such as
+s"*-*-*-*-BL-*". Personally, I would find this confusing and prone
to error, but if you are good at asterisk counting, this will work.
> Secondly, is there any possibility to link each referring expression
> on the *CHI line with its coding on the %cod line? Provisionally, we
> opted for typing the referring expression before the coding string,
> e.g., 'die katze'.
Ah, herein lies the rub (somewhere in Shakespeare). You are basically
trying to construct something like the %mor line with its 1-to-1 match
to the main line. This is a great idea. However, the CLAN software
is not yet really ready for this. We are currently right in the
middle of implementing strict 1-to-1 matching between the %mor and the
main tier within the XML version of CLAN. Once this is finished then
"match" searches will work with the %mor line. At that point, it
would be relatively easy to extend this to a tier called %mat for a
user-defined matching tier. However, none of this will be ready until
later this year.
> Thirdly and most importantly, we want to conduct analyses concerning
> the cooccurrence of elements within each coding string. For instance,
> we want to investigate differences in children's realisation of
> referents as a function of referent introduction vs. anaphorical
> expression (reference maintenance). For that, we want to find a range
> of cooccurrences as the following:
> DA:N and NV and IND
> (where DA:N means definite article + noun, NV means referent
> introduction, and IND means indirect anaphor)
I am not sure what you mean by "range" in your phrase "a range of
cooccurrences". However, finding *-DA:N-*^*^*-NV-* should be possible.
> I have tried COMBO, but either I don't understand the principle for the
> syntax of the command line or I miss some important switch or, well, I
> don't know what.
You probably just have to play around to learn how to use COMBO.
> Two things are very important for us in such searches:
> - The search must be limited to each of the coding strings and not be
> based on the whole %cod line. For instance, when looking for the
> cooccurrence DA:N and DIS, CLAN would be supposed not to find it in
> the example above, since it doesn't occur in any of the 3 coding
> strings. That is, for this concrete example, how can we proceed for
> ensuring that CLAN ignores the cooccurrence of DA:N for 'der hund' and
> DIS for 'sie'?
That should be easy enough. In COMBO lines, it is the ^ that searches
across word boundaries. Just make sure that your search strings don't
include the ^. So, you want
*-DA:N-*-DIS-*
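
To make the "one coding string at a time" idea concrete, here is a
small Python sketch (again outside CLAN, with hypothetical names such
as cooccurs). It assumes each entry has the form expression|code and
that the %cod line is already on a single line. It reports only those
expressions whose own coding string contains all of the requested
codes.

```python
import re

# Matches "expression|code" pairs; the expression may contain spaces.
PAIR = re.compile(r"\s*([^|]+?)\|(\S+)")

def cooccurs(cod_line, *wanted):
    """Return the expressions whose own coding string has every wanted code."""
    body = cod_line.split(":", 1)[1] if cod_line.startswith("%cod:") else cod_line
    hits = []
    for expr, code in PAIR.findall(body):
        levels = set(code.split("-"))  # e.g. {"S", "DA:N", "BL", ...}
        if all(w in levels for w in wanted):
            hits.append(expr)
    return hits

line = ("%cod: der hund|S-DA:N-BL-V1:3-AS-DIR-hundC "
        "sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC "
        "den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC")

print(cooccurs(line, "DA:N", "DIS"))        # no single string has both
print(cooccurs(line, "DA:N", "NV", "IND"))  # only the 'den schwanz' string
```

Because the codes are compared as whole levels, DA:N on 'der hund' and
DIS on 'sie' are never combined into one hit.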
> - How can we proceed to get quantitative results of such searches? I
> mean, in addition to the concrete hits shown in the output window,
> it'd be very important to have the number of cooccurrences found in
> each chat file, as well as in all chat files in which the cooccurrence
> was looked for.
> I apologize if the answers to my questions are obvious or easy to
> find in the CLAN manual. I have read the manual very carefully
> before sending this query, but I don't seem to be able to find the
> needed answers therein.
I don't think you can really learn this stuff by reading the manual.
You just have to devote an hour or two to playing around with COMBO.
Think of it as a Bach theme with variations.
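
On the question of counts per file and across files, one option
outside CLAN is a short script. The following Python sketch is only an
illustration under stated assumptions: the files are UTF-8 CHAT files,
continuation lines of a tier start with a tab and break only at spaces,
each %cod entry has the form expression|code, and the names cod_tiers,
count_hits and count_in_files are hypothetical.

```python
import re
from pathlib import Path

# Matches "expression|code" pairs; the expression may contain spaces.
PAIR = re.compile(r"\s*([^|]+?)\|(\S+)")

def cod_tiers(text):
    """Yield each %cod tier as one string, joining tab-indented continuations."""
    current = None
    for raw in text.splitlines():
        if raw.startswith("\t") and current is not None:
            current += " " + raw.strip()
            continue
        if current is not None:
            yield current
        current = raw[len("%cod:"):] if raw.startswith("%cod:") else None
    if current is not None:
        yield current

def count_hits(text, wanted):
    """Count coding strings that contain every code in wanted."""
    wanted = set(wanted)
    return sum(
        1
        for tier in cod_tiers(text)
        for _expr, code in PAIR.findall(tier)
        if wanted <= set(code.split("-"))
    )

def count_in_files(paths, wanted):
    """Return ({file name: count}, total) for the given CHAT files."""
    per_file = {
        str(p): count_hits(Path(p).read_text(encoding="utf-8"), wanted)
        for p in paths
    }
    return per_file, sum(per_file.values())

# Example call (file pattern is an assumption):
# per_file, total = count_in_files(sorted(Path(".").glob("*.cha")), ["DA:N", "NV", "IND"])
```

The per-file dictionary gives the number of matching coding strings in
each file, and the second return value is the sum over all files.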
--Brian MacWhinney
> Many, many thanks in advance for any hint.
> Kind regards,
> Susanna
> *****************************************************************
> Susanna Bartsch
> https://www.zas.gwz-berlin.de/mitarb/homepage/bartsch/
> bart...@zas.gwz-berlin.de
> Zentrum fuer Allgemeine Sprachwissenschaft (ZAS)
> Centre for General Linguistics
> Schuetzenstr. 18
> 10117 Berlin
> Germany
> Tel. +49 (0)30 20 192 503
> Fax +49 (0)30 20 192 402
> *****************************************************************