update


Brian MacWhinney

Jul 5, 2023, 3:15:12 PM
to cab...@googlegroups.com
Dear CABank,
As you may have noticed, this mailing list is not used frequently. However, it can still serve as a good way of updating CA users on new developments related to CA and CABank within TalkBank. There are two major developments in TalkBank that will interest people doing CA.
The first is the success we have been having recently with the use of automatic speech recognition (ASR) to create initial versions of transcripts from media. In the summer of 2022, we worked with Houjun Liu to apply the Rev-AI ASR system and the Montreal Forced Aligner (MFA) to TalkBank data using a Python script. This system has been remarkably successful, reducing transcription time to about 4 times recording time. We are now using it to automatically transcribe new data, transcribe untranscribed audio, and time-align FluencyBank and other TalkBank data. Although the output of ASR will still need a lot of work to create a proper CA transcript, the initial output is nicely linked to the audio and provides a good framework for further work. An open-access article describing this "Batchalign" system is now available at https://doi.org/10.1044/2023_JSLHR-22-00642 and we have made the Batchalign system publicly available at https://github.com/talkbank . We are happy to provide email and Zoom support for users who want to explore use of this system.
The second development seems perhaps even more directly relevant to work in CA. This is the Collaborative Commentary or CC system at https://talkbank.org/CC. CC allows project groups to create a set of tags for language behaviors and locate instances of those tags in CHILDES data available directly through the TalkBankBrowser on the web. Eight research groups are using the alpha version of this system for teaching and research. Three are using CC for data from AphasiaBank, one for ClassBank data, one for DementiaBank data, and three for CHILDES data. However, it is difficult to think of an area of language analysis for which CC would be better suited than CA. I have used the system in classes at CMU, asking students to analyze multimodal interactions from CABank, CHILDES, ClassBank, and AphasiaBank. It gives them remarkably direct access to the details of interactions, as well as the ability to code and analyze patterns. They can enter comments and codes directly into the transcript in the browser, and I can even send email to them based exactly on the analyses they have produced. Apart from its use for teaching, the system has provided great support for researchers. For example, a group in Australia is examining CA features in the language of people recovering from traumatic brain injury, and a group of educators in the States has been examining the role of academically productive talk (APT) in a set of 44 classroom interactions.

If people are interested in making use of either of these new systems, we are more than happy to provide support through email and Zoom.

Best regards,

— Brian MacWhinney
Teresa Heinz Professor of Cognitive Psychology,
Language Technologies and Modern Languages, CMU


Corey Miller

Jul 5, 2023, 4:37:34 PM
to cab...@googlegroups.com
Thank you, Brian!
