Syntactic Complexity Measure?

Kimberly Mueller

Jun 12, 2015, 10:13:21 AM

to chib...@googlegroups.com

Dear All,

Kimberly Mueller here from UW Madison, using CHAT/CLAN (thank you!!!) to transcribe 5-10 minute language samples from adults ages 40 - 80 who are at risk for developing Alzheimer's Disease.

Would anyone (and/or everyone!) please provide guidance on measures you recommend to capture syntactic complexity? We have been segmenting utterances using the guidance in the CHAT manual (see below). Any help with codes/commands would also be appreciated!

Many thanks,
Kim

Utterance Manual for CHAT/CLAN

Utterances in CHAT/CLAN are separated using a T-unit classification.

A T-unit consists of an independent clause and its corresponding dependent clauses.

An independent clause includes a subject and a verb.

A dependent clause provides additional information to an independent clause, but it cannot stand by itself.

For example, if you were transcribing “I went to the store, but I didn’t find anything to buy,” you would separate this into two utterances in CHAT/CLAN.

*PAR: I went to the store.

*PAR: but I didn’t find anything to buy.

Even though these two parts go together in a sentence, they would be separated into two utterances per the T-unit classification.

An example of a dependent and independent clause would be as follows:

*PAR: If I show up late the teacher will give me a tardy.

In CHAT/CLAN, you also need to make a few judgement calls.

For example, in speech, we often start sentences with “because.” “Because” is typically at the beginning of a dependent clause in written communication; however, we use this to start sentences when speaking and thus a transcriber needs to decide whether the “because” is actually starting the sentence in a spoken utterance.

Brian MacWhinney

Jun 12, 2015, 12:20:36 PM

to ChiBolts, Kimberly Mueller

Dear Kim,

None of the current versions of either the CHAT or CLAN manuals includes the material you are citing. Perhaps you are working with an old version of the manual, or with some locally prepared guidelines. It would be best to get new versions from the web.

There are other ways, besides T-units, to look at syntactic complexity. MLU is not bad for children or aphasics, but I don’t yet know about AD. CLAN also includes the IPSyn and DSS programs, but again they are really targeted at studies of language development. Recently we introduced the CPIDR propositional density measure, which played a big role in the Nun Study for AD. Perhaps that would be useful in your case. All of these measures require you to have first run MOR on your transcript to create a %mor line.

—Brian MacWhinney

--
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u...@googlegroups.com.
To post to this group, send email to chib...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/8d9a173c-9956-4fb2-b8dc-573728451542%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

RSteinkrauss

Jun 13, 2015, 4:44:49 AM

to chib...@googlegroups.com

Dear Kim,

In our spoken data from elderly L1 attriters and L2 learners, we
hand-coded a dedicated syntactic complexity tier with information such as
clause type, number of finite and non-finite verbs, and noun phrase
length, because the measures we were interested in were not directly
supported by CLAN. We based our choice of measures on Bulté and Housen's
and Norris and Ortega's work; you might want to look into their findings.

Regarding your data, a relatively simple way to get at two widely used
syntactic complexity measures would be to proceed as described in the
material you cited and additionally introduce a tier where you code, for
every clause, whether it is an independent or a dependent clause. Using
CLAN to count the dependent and independent clauses in each transcript,
as well as the total number of words in each transcript, would then allow
you to calculate the average length of a T-unit in words (total words /
number of independent clauses) and a kind of subordination ratio
(dependent / independent clauses).
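
To make the arithmetic concrete, here is a minimal Python sketch of those two ratios. It is only an illustration under assumptions: the %xcl tier name and the IND/DEP codes are hypothetical labels for the hand-coded tier described above, not something CHAT or CLAN defines, and the word count is a crude token count rather than CLAN's own counting rules.

```python
import re

# Hypothetical coding scheme (an assumption for illustration): each *PAR:
# utterance is followed by a user-defined %xcl: tier with one code per
# clause, IND for independent and DEP for dependent.
SAMPLE = """\
*PAR:\tI went to the store .
%xcl:\tIND
*PAR:\tbut I didn't find anything to buy .
%xcl:\tIND
*PAR:\tif I show up late the teacher will give me a tardy .
%xcl:\tDEP IND
"""


def complexity(transcript, speaker="PAR"):
    """Return word and clause counts plus the two ratios for one speaker."""
    words = independent = dependent = 0
    in_speaker_turn = False
    for line in transcript.splitlines():
        if line.startswith("*"):
            # A main tier line: remember whether it belongs to our speaker.
            in_speaker_turn = line.startswith(f"*{speaker}:")
            if in_speaker_turn:
                text = line.split(":", 1)[1]
                # Count alphabetic tokens only, so terminators like "." are skipped.
                words += len(re.findall(r"[A-Za-z']+", text))
        elif line.startswith("%xcl:") and in_speaker_turn:
            codes = line.split(":", 1)[1].split()
            independent += codes.count("IND")
            dependent += codes.count("DEP")
    if independent == 0:
        raise ValueError("no independent clauses coded for this speaker")
    return {
        "words": words,
        "independent": independent,
        "dependent": dependent,
        "mean_t_unit_length": words / independent,
        "subordination_ratio": dependent / independent,
    }


if __name__ == "__main__":
    print(complexity(SAMPLE))
```

For the three sample utterances this gives 24 words, 3 independent clauses and 1 dependent clause, so a mean T-unit length of 8 words and a subordination ratio of about 0.33.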

However, I don't know whether these measures are appropriate for
relating to the risk of AD. Also, since you are dealing with spoken data,
I'd recommend looking into AS-units instead of T-units as your unit of
analysis (see Foster's work).

Regards,
Rasmus Steinkrauss

Bernadette Plunkett

Jun 13, 2015, 9:22:13 AM

to chib...@googlegroups.com

Dear Kim,

Once you have found a good measure, do you have in mind what you are going to measure it against? Does anyone know whether such measures have been applied to the speech of the adults in some of the child corpora, for example? I know some people will take the view that child-directed speech is simpler than normal adult speech, but it would provide some point of comparison, and in some of my own data the adults speaking to three- and four-year-olds are using pretty sophisticated syntax. Sophisticated though it may be, it doesn't always include a lot of embedding, so I don't know how representative simply counting dependent clauses would be.

I'd be happy to hear what measures you fix on once you've experimented a bit.

Bernadette Plunkett
