TextGrid filename in csv?

PAL Anusuya

Jul 12, 2023, 3:22:33 AM
to Parselmouth
Dear all, 

Thanks, Yannick, for making Parselmouth. It's really awesome. 

I need to find the duration of each syllable in my sound files according to the annotated TextGrid. With the Python script given below, I extract the start time (xmin in my script), the end time (xmax in my script), and the syllable label (text in my script). Finally, I save the output as a .csv file. 

The script is given below for your reference. One of the sound files and its TextGrid are also attached. The script itself works without problems. 

I was just wondering if there is any way to also add the file name of the respective TextGrid (for example, s1_ba1_chase_iso.TextGrid) to the output, along with xmin, xmax, and text. 

Any suggestions would really be helpful. 

Thanks,
Anu

# Import libraries (only the ones this script actually uses)
import glob
import pandas as pd
import textgrids

# Create lists to collect the results
xmin_list = []
xmax_list = []
text_list = []

for textgrid_file in glob.glob(r"C:\Users\anusu\Maram_Tone\ba\ba_chase\*.TextGrid"):
    grid = textgrids.TextGrid(textgrid_file)
    # "Mary" is the name of the tier containing the syllable information
    for syll in grid["Mary"]:
        if syll.containsvowel():
            # Convert Praat notation to Unicode in the label
            label = syll.text.transcode()
            text_list.append(label)
            xmin_list.append(syll.xmin)
            xmax_list.append(syll.xmax)

# Build the table from a dict so that xmin and xmax stay numeric
df = pd.DataFrame({"text": text_list, "xmin": xmin_list, "xmax": xmax_list})
df.to_csv("textgrid.csv", index=False)
s1_ba1_chase_iso.wav
s1_ba1_chase_iso.TextGrid
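
Since the stated goal is the duration of each syllable, it may help to note that the duration follows directly from the two times already collected: duration = xmax - xmin. A minimal sketch, assuming made-up interval times in place of values read from a TextGrid (the numbers and the helper name `syllable_durations` are illustrative and not part of the script above):

```python
def syllable_durations(xmin_list, xmax_list):
    """Return one duration (in seconds) per interval, computed as xmax - xmin."""
    if len(xmin_list) != len(xmax_list):
        raise ValueError("xmin_list and xmax_list must have the same length")
    return [xmax - xmin for xmin, xmax in zip(xmin_list, xmax_list)]


# Hypothetical interval boundaries in seconds (not taken from the attached files)
xmin_list = [0.0, 0.25, 1.0]
xmax_list = [0.25, 1.0, 1.5]

durations = syllable_durations(xmin_list, xmax_list)
```

The resulting list could be written out as one more column next to xmin and xmax.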

PAL Anusuya

Jul 12, 2023, 4:31:39 AM
to Parselmouth
Hi, 

The problem is solved :-) if I add two lines in the for loop: 

import os

filename_list = []  # create this alongside the other result lists

for textgrid_file in glob.glob(r"C:\Users\anusu\Maram_Tone\ba\ba_chase\*.TextGrid"):
    grid = textgrids.TextGrid(textgrid_file)
    # "Mary" is the name of the tier containing the syllable information
    for syll in grid["Mary"]:
        # if syll.containsvowel():
        # Convert Praat notation to Unicode in the label
        label = syll.text.transcode()
        text_list.append(label)
        xmin_list.append(syll.xmin)
        xmax_list.append(syll.xmax)
        # The two added lines: keep only the file name, not the full path
        filename = os.path.basename(textgrid_file)
        filename_list.append(filename)
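
To illustrate the file-name column on its own, here is a small sketch that strips the folder part from a path and writes one row per interval as CSV. It uses only the standard library. The interval values are invented, the full path is assumed by combining the folder and the example file name mentioned in this thread, and `pathlib.PureWindowsPath` is a choice made for this sketch so that the Windows-style path is parsed the same way on any operating system.

```python
import csv
import io
from pathlib import PureWindowsPath

# Assumed full path: the folder from the glob pattern plus the example file name
textgrid_file = r"C:\Users\anusu\Maram_Tone\ba\ba_chase\s1_ba1_chase_iso.TextGrid"

# Keep only the last path component (the file name)
filename = PureWindowsPath(textgrid_file).name

# Hypothetical (text, xmin, xmax) values standing in for the tier's intervals
intervals = [("ba", 0.25, 0.75), ("ba", 1.0, 1.5)]

# One row per interval, each carrying the name of the file it came from
rows = [
    {"filename": filename, "text": text, "xmin": xmin, "xmax": xmax}
    for text, xmin, xmax in intervals
]

# Write to an in-memory buffer; a real script would open a .csv file instead
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["filename", "text", "xmin", "xmax"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Collecting one dict per interval keeps the file name attached to its row, so the columns cannot drift out of step the way four separate lists could.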

However, a new problem arises. In some of my files, two tiers have the same name (Mary); see the attached file. In those files, I need to select only the first or the second item (depending on the situation). Getting the "Mary" tier of the first item (item [1]) is problematic, because the reading does not stop there and also picks up the second item in the list. 

Does anyone have any idea how to read just the first item of the TextGrid? 

Thanks, 
Anu
s6_ba1_chase_iso.wav
s6_ba1_chase_iso.TextGrid

yannick...@gmail.com

Jul 12, 2023, 4:14:36 PM
to Parselmouth
Hi Anu

I'm not familiar with the textgrids library (I tend to use TextGridTools/tgt), and it's completely unrelated to Parselmouth, so I don't think I can help you more than the documentation for this textgrids library.
Do note that Python uses 0-based indexing, so [1] is the second item of a list.

Kind regards
Yannick
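
To make the indexing point concrete, here is a small sketch in plain Python. It does not use the textgrids library: the list of `(name, labels)` pairs is a hypothetical stand-in for a TextGrid whose two tiers are both named "Mary", and the labels are invented. It shows that selecting by position is unambiguous, while a name-based lookup can only hold one tier per name.

```python
# Hypothetical stand-in for a TextGrid with two tiers that share a name
tiers = [
    ("Mary", ["ba", "ka"]),     # first tier: index 0
    ("Mary", ["high", "low"]),  # second tier: index 1
]

# Selecting by position: index 0 is the first tier, index 1 the second
first_name, first_labels = tiers[0]
second_name, second_labels = tiers[1]

# Selecting by name: a dict keeps only the last tier stored under each name
by_name = dict(tiers)
```

Whether the textgrids library stores tiers this way is not confirmed here; the sketch only shows why a lookup by name cannot distinguish two tiers called "Mary", whereas a lookup by position can.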

PAL Anusuya

Jul 12, 2023, 11:19:01 PM
to Parselmouth
I see. Thanks, Yannick, for your prompt reply. 
Regards,
Anu

Jun Wang

Jul 20, 2023, 10:15:54 AM
to Parselmouth
Hello, 
I also need to calculate the duration of all syllables, but I don't know how to read and write the script. So I would like to consult your scripts. Is that possible? 
Thanks 
