Hi, please help me with my questions.
I've used the software (PHN_EN_TIMIT_LCRC_N500).
First, I would like to ask what sample rate the input file should have. When I use a 16 kHz file from the TIMIT database, the recognized phonemes match the transcript. But when I downsample the file with sox, the phonemes no longer match the utterance, and the time reported for the last phoneme is only half the duration of the utterance. So I want to confirm the sample rate the software expects for its input.
Second, if I have a stereo (2-channel) file, can I just use sox to convert it to a single channel and then use the result as input to the software?
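
For reference, here is a minimal sketch of how the sample rate and channel count of a WAV file could be checked before it is passed to the software. It uses only the Python standard library `wave` module. The function names (`describe_wav`, `apparent_duration`, `write_silence`) are hypothetical. The 16 kHz figure in `apparent_duration` is an unconfirmed assumption about what the software expects, used only to show how a halved duration could arise. The 8 kHz demo file is likewise only an illustration, not the actual downsampled file.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return (sample_rate_hz, n_channels, duration_s) read from a PCM WAV header."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return rate, channels, frames / rate


def apparent_duration(path, assumed_rate_hz):
    """Duration a reader would compute if it ignored the header's sample rate
    and assumed a fixed rate instead (an assumption, not confirmed behaviour)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / assumed_rate_hz


def write_silence(path, rate_hz, channels, seconds):
    """Write a silent 16-bit PCM WAV file; used here only as demo input."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * channels * int(rate_hz * seconds))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo_8k.wav")
        # Stand-in for a downsampled file: 2 seconds of mono audio at 8 kHz.
        write_silence(demo, rate_hz=8000, channels=1, seconds=2)
        print(describe_wav(demo))
        # The same samples read as if they were 16 kHz last half as long.
        print(apparent_duration(demo, 16000))
```

If the channel count reported for a file is 2, it is still stereo and would need to be mixed down to one channel before any mono-only tool could read it as intended.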