Phoneme posterior probabilities

Bego A.

Mar 9, 2017, 8:29:33 AM
to phnrec
Hello,

I'm trying to get the phone posterior probabilities for every .wav file (my queries). At first I had some problems with the sample rate, but I was able to solve them using sox. I use this code to get the phone posteriors for all the files in my folder:

for wave in *.wav
do
    # Resample to 8 kHz, then overwrite the original file
    sox "$wave" -r 8000 output.wav
    mv output.wav "$wave"
    # ID = second '-'-separated field of the file name, up to the first '.'
    string=$(echo "$wave" | cut -d'-' -f2 | cut -d'.' -f1)
    ./phnrec -c PHN_CZ_SPDAT_LCRC_N1500 -i "$wave" -o out1.post
    ./phnrec -c PHN_RU_SPDAT_LCRC_N1500 -i "$wave" -o out2.post
    ./phnrec -c PHN_HU_SPDAT_LCRC_N1500 -i "$wave" -o out3.post
    ./phnrec -c PHN_EN_TIMIT_LCRC_N500 -i "$wave" -o out4.post
    cat out1.post out2.post out3.post out4.post > "phoneposteriors$string.post"
done
exit


What I get from this is:
000000 2100000 pau -41.710022
2100000 3300000 v -29.128357
3300000 4900000 s -37.955383
4900000 5500000 z -27.973251
000000 300000 i -8.718267
300000 2100000 pau -28.033392
2100000 2800000 n: -19.235672
2800000 3300000 n -7.451252
3300000 4800000 a: -22.798630
4800000 5500000 pau -23.796509
000000 500000 l -25.059494
500000 2100000 pau -22.504044
2100000 3000000 v -15.880817
3000000 3600000 O -17.841419
3600000 4800000 E -29.468086
4800000 5500000 spk -24.249283
000000 500000 t -16.065859
500000 1100000 hh -11.839073
1100000 1500000 ay -12.544214
1500000 2200000 k -15.780460
2200000 2700000 ay -15.217728


This is an example of the phone posteriors for one wave file. They are only the best probabilities, but I need all the probabilities for this file so that I can use them to recognize audio (words) in a larger audio file (a conversation).


Thank you,

Begoña Aguirre

Petr Schwarz

Mar 9, 2017, 8:44:18 AM
to phn...@googlegroups.com
Dear Begoña,

these are not phoneme posterior probabilities. This is the one-best string (the best phoneme sequence) coming from the decoder.
The values at the end of each line are phoneme log-likelihoods accumulated over the best path, so they do not sum to one.
Use the '-t post' switch to get posteriors. These are state posteriors (3 states per phoneme) for each 10 ms frame, stored in the HTK feature format (see the HTK Book, http://htk.eng.cam.ac.uk/).
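As a rough sketch of how this could look in the loop from the question: the block below runs each of the four systems with '-t post' and then prints the 12-byte HTK header of a resulting file. The function names, the PHNREC variable, and the output naming scheme are made up for illustration. The header layout (frame count, frame period in 100 ns units, bytes per frame, parameter kind) is the standard HTK one; that the file is big-endian with uncompressed 4-byte floats is an assumption that should be checked against a real phnrec output. Since every HTK file starts with its own header, concatenating several of them with cat would not give a valid HTK file, so the sketch keeps one output file per system.

```shell
#!/bin/sh
# Path to the recognizer; override PHNREC if it lives elsewhere (hypothetical variable).
PHNREC=${PHNREC:-./phnrec}
SYSTEMS="PHN_CZ_SPDAT_LCRC_N1500 PHN_RU_SPDAT_LCRC_N1500 PHN_HU_SPDAT_LCRC_N1500 PHN_EN_TIMIT_LCRC_N500"

# post_name WAVE SYSTEM: output file name for one wave/system pair (naming is illustrative).
post_name() {
    base=${1##*/}
    base=${base%.wav}
    printf '%s.%s.post\n' "$base" "$2"
}

# extract_posteriors WAVE: run every system with '-t post', one output file per system.
extract_posteriors() {
    for sys in $SYSTEMS; do
        "$PHNREC" -c "$sys" -t post -i "$1" -o "$(post_name "$1" "$sys")" || return 1
    done
}

# htk_header FILE: decode the 12-byte HTK header, assuming big-endian byte order.
htk_header() {
    set -- $(od -An -tu1 -N12 "$1")
    [ $# -eq 12 ] || return 1
    n_frames=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    period=$(( ($5 << 24) | ($6 << 16) | ($7 << 8) | $8 ))
    frame_bytes=$(( ($9 << 8) | ${10} ))
    parm_kind=$(( (${11} << 8) | ${12} ))
    # Assumes uncompressed 4-byte floats, so values per frame = bytes per frame / 4.
    echo "frames=$n_frames period_100ns=$period values_per_frame=$((frame_bytes / 4)) parm_kind=$parm_kind"
}
```

A call such as `for wave in *.wav; do extract_posteriors "$wave"; done` would then produce the per-system files, and `htk_header` on one of them should report a period of 100000 (10 ms). With 3 states per phoneme, values_per_frame divided by 3 should match the size of that system's phoneme set.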
 
Best regards,
Petr
