TextGrid filename in csv?

. PAL Anusuya

Jul 12, 2023, 3:22:33 AM

to Parselmouth

Dear all,

Thanks, Yannick, for making Parselmouth. It's really awesome.

I need to find the duration of each syllable in my sound files, based on the annotated TextGrids. The Python script given below extracts the start time (xmin), the end time (xmax), and the syllable label (text) of each interval, and saves the output as a .csv file.

One of the sound files and its TextGrid are attached. The script works fine as it is.

I was just wondering if there is any way to also add the file name of the respective TextGrid (for example, s1_ba1_chase_iso.TextGrid) to the output, along with xmin, xmax, and text.

Any suggestions would really be helpful.

Thanks,

Anu

# import libraries

import glob
import numpy as np
import pandas as pd
import parselmouth
import statistics
import textgrids

# create lists to put the results
xmin_list = []
xmax_list = []
text_list = []

for textgrid_file in glob.glob(r"C:\Users\anusu\Maram_Tone\ba\ba_chase\*.TextGrid"):
    grid = textgrids.TextGrid(textgrid_file)
    # "Mary" is the name of the tier containing the syllable intervals
    for syll in grid["Mary"]:
        if syll.containsvowel():
            # Convert Praat notation to Unicode in the label
            label = syll.text.transcode()
            text_list.append(syll.text)
            xmin_list.append(syll.xmin)
            xmax_list.append(syll.xmax)

df = pd.DataFrame(np.column_stack([text_list, xmin_list, xmax_list]),columns=['text', 'xmin', 'xmax'])
df.to_csv("textgrid.csv", index=False)
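The script stores xmin and xmax but not the duration itself; the duration of an interval is xmax minus xmin. A minimal sketch of that step, using made-up interval values in place of a real TextGrid (the labels, the times, and the helper name with_durations are illustrative assumptions, not taken from the attached files):

```python
# Hypothetical (text, xmin, xmax) values standing in for annotated intervals.
intervals = [
    ("ba", 0.25, 0.75),
    ("ka", 1.0, 1.5),
]


def with_durations(rows):
    """Return (text, xmin, xmax, duration) tuples, with duration in seconds."""
    return [(text, xmin, xmax, xmax - xmin) for text, xmin, xmax in rows]


for text, xmin, xmax, duration in with_durations(intervals):
    print(f"{text}: {duration:.3f} s")
```

In the script above, the same subtraction could be appended to an extra list next to xmin_list and xmax_list.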

s1_ba1_chase_iso.wav

s1_ba1_chase_iso.TextGrid

. PAL Anusuya

Jul 12, 2023, 4:31:39 AM

to Parselmouth

Hi,

The problem is solved :-) if I add two lines in the for loop:

# Also needs "import os" and "filename_list = []" before the loop
for textgrid_file in glob.glob(r"C:\Users\anusu\Maram_Tone\ba\ba_chase\*.TextGrid"):
    grid = textgrids.TextGrid(textgrid_file)
    # "Mary" is the name of the tier containing the syllable intervals
    for syll in grid["Mary"]:
        # if syll.containsvowel():
        # Convert Praat notation to Unicode in the label
        label = syll.text.transcode()
        text_list.append(syll.text)
        xmin_list.append(syll.xmin)
        xmax_list.append(syll.xmax)
        # Keep only the file name, without the directory
        filename = os.path.basename(textgrid_file)
        filename_list.append(filename)
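For reference, os.path.basename strips the directory part of a path, whereas os.path.join called with a single argument returns that argument unchanged. The sketch below shows the difference and writes one record per interval with the standard csv module, which keeps the file name aligned with its values. The path and the row values are hypothetical, and an in-memory buffer stands in for the real textgrid.csv:

```python
import csv
import io
import os

# Hypothetical path, for illustration only.
path = os.path.join("ba_chase", "s1_ba1_chase_iso.TextGrid")

print(os.path.join(path))      # unchanged: still includes the directory
print(os.path.basename(path))  # s1_ba1_chase_iso.TextGrid

# One record per interval, so each row carries its own file name.
rows = [
    {"filename": os.path.basename(path), "text": "ba", "xmin": 0.25, "xmax": 0.75},
]

buffer = io.StringIO()  # stands in for open("textgrid.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=["filename", "text", "xmin", "xmax"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```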

However, a new problem arises. In some of my files, two tiers have the same name (Mary); see the attached file. In those files, I need to select only the first or the second item, depending on the situation. Getting "Mary" from the first item (item [1]) is problematic: the script does not stop there and also reads the second item.

Does anyone have any idea how to read only the first item (tier) of the TextGrid?

Thanks,

Anu

s6_ba1_chase_iso.wav

s6_ba1_chase_iso.TextGrid

yannick...@gmail.com

Jul 12, 2023, 4:14:36 PM

to Parselmouth

Hi Anu

I'm not familiar with the textgrids library (I tend to use TextGridTools/tgt), and it's completely unrelated to Parselmouth, so I don't think I can help you more than the documentation for this textgrids library.

Do note that Python uses 0-based indexing, so [1] is the second item of a list.
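To illustrate the indexing point: Praat's TextGrid text format numbers tiers from 1 (item [1] is the first tier), while a Python list starts at index 0. The sketch below uses a plain list of made-up (name, intervals) pairs, not the textgrids API, to show that position is the only way to tell two same-named tiers apart and that a name-keyed dict cannot hold both. Whether the textgrids library behaves like the dict here is an assumption this sketch does not establish:

```python
# Hypothetical tiers as (name, intervals) pairs; two tiers share a name.
tiers = [
    ("Mary", ["ba", "ka"]),
    ("Mary", ["ba ka"]),
]

first_name, first_intervals = tiers[0]    # first tier (Praat's item [1])
second_name, second_intervals = tiers[1]  # second tier (Praat's item [2])

print(first_intervals)   # ['ba', 'ka']
print(second_intervals)  # ['ba ka']

# A name-keyed dict cannot hold both tiers: the later entry replaces the earlier.
by_name = dict(tiers)
print(len(by_name))  # 1
```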

Kind regards

Yannick

. PAL Anusuya

Jul 12, 2023, 11:19:01 PM

to Parselmouth

I see. Thanks, Yannick, for your prompt reply.

Regards,

Anu

Jun Wang

Jul 20, 2023, 10:15:54 AM

to Parselmouth

Hello,

I also need to calculate the duration of all syllables, but I don't know how to read and write such a script. Would it be possible to consult your scripts?

Thanks
