Hello,
I am currently working with multilingual transcripts (e.g., English/Spanish, English/Korean). Each contains roughly 70-100 utterances of parent-reported first words and phrases from children between 12 and 26 months old, so most utterances are 1-3 words long, though some are longer. We asked parents to report what their child said across multiple days, in whichever language the child used.
We would like to extract lemmas and map them to unilemmas (e.g., Mommy in English, Mamá in Spanish, 엄마 in Korean), both across children who speak different languages and within a single child who might use multiple languages. To facilitate this, I was wondering whether batchalign works with multilingual transcripts.
Thank you!
Janet