output of single channel and dual channels


Ma Jambo

Jun 8, 2015, 7:06:35 PM
to phn...@googlegroups.com
Hi, everyone,
I am a PhD student. I would like to use Phnrec to extract phoneme information, but I am having some trouble with it and hope someone can help me.
I want to use the time information to determine the start and end points of a particular phoneme, but I found that the time information does not match the duration of the utterance.
For example, my input file is a 6-minute, 2-channel file with an 8000 Hz sampling frequency. The time for the last phoneme is 3599300000, which is 5.99 minutes. But when I use the same file converted to a single channel, the time for the last phoneme is 1799600000, which is about 3 minutes, half of the previous value.
Can anyone please help me with this?
Thank you very much.
Regards,
Jambo

Petr Schwarz

Jun 8, 2015, 7:16:51 PM
to phn...@googlegroups.com
Dear Jambo,

The software does not work with stereo files. It expects raw mono 8 kHz
16-bit PCM files (or A-law if specified by a switch). So if you input a
stereo file, you have 2x more samples.
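
The arithmetic behind this can be sketched in Python. This is only an illustration, assuming the reader treats the input as headerless mono 8 kHz 16-bit PCM as described above. The function name `apparent_duration_seconds` and the byte counts are hypothetical and not measured from any real file.

```python
def apparent_duration_seconds(num_bytes, sample_rate=8000, channels=1,
                              bytes_per_sample=2):
    """Duration that a reader assuming the given format infers from a
    headerless audio stream of num_bytes bytes."""
    return num_bytes / (sample_rate * channels * bytes_per_sample)


TRUE_SECONDS = 360  # a hypothetical 6-minute recording

# Stereo 16-bit data: twice as many samples as the mono reader expects.
stereo_16bit_bytes = TRUE_SECONDS * 8000 * 2 * 2
print(apparent_duration_seconds(stereo_16bit_bytes))

# Mono 8-bit data (for example A-law) read as 16-bit: half as many samples.
mono_8bit_bytes = TRUE_SECONDS * 8000 * 1 * 1
print(apparent_duration_seconds(mono_8bit_bytes))
```

Comparing the apparent duration with the known length of the recording is a quick way to spot a mismatch in channel count or sample width before running the recognizer.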

Petr


Ma Jambo

Jun 8, 2015, 10:28:46 PM
to phn...@googlegroups.com
Dear Petr,
Thank you for your help.
But I think I am still confused. Here I would like to explain my confusion and what I have done.
For example, I use the file "sw_45098.wav"; its information is shown below in the red circle. It is a mono file with a duration of almost 6 minutes. I then converted it into a raw file.


The problem is that after I run the software and read the results in the .rec file, the time of the last recognized phoneme is half the duration of the original file. The time I calculate (1799600000/600000000 = 2.99 min) is almost 3 minutes.
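
The same conversion can be written as a small Python sketch. It assumes the times in the .rec file are in units of 100 ns, which is what dividing by 600000000 per minute implies. The names `UNITS_PER_SECOND` and `label_time_to_seconds` are hypothetical.

```python
# Assumption: label times are counted in 100 ns units,
# so there are 10,000,000 units per second.
UNITS_PER_SECOND = 10_000_000


def label_time_to_seconds(t):
    """Convert a label timestamp in 100 ns units to seconds."""
    return t / UNITS_PER_SECOND


# The two end times reported in this thread.
for t in (3599300000, 1799600000):
    seconds = label_time_to_seconds(t)
    print(f"{t} -> {seconds:.2f} s = {seconds / 60:.2f} min")
```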

This is what I have done and where I am confused. Do you think my steps are right, or is there something wrong that causes the problem?
Thank you.
Regards,
Jambo

Ma Jambo

Jun 8, 2015, 10:31:54 PM
to phn...@googlegroups.com
Sorry, maybe you can't see the pictures. I am sending them as attachments in case you cannot see them.


1.png
2.png
3.png