Speech Signal Separation into Voiced/Unvoiced Parts

Slava

Dec 1, 2009, 2:23:01 PM

Hello, all.
As part of my degree project, I need to use MATLAB to separate a recorded speech file into voiced and unvoiced parts.
So far I can record a file in MATLAB and calculate the average zero-crossing rate, the autocorrelation, and the short-time energy.
Next, I need to show, or separate, the "voiced" and "unvoiced" parts of the recorded signal.
I understand that I somehow need to compare the zero-crossing and energy vectors...

Please assist.

Thank you in advance.
---------
Slava

yogesh angal

Dec 17, 2009, 3:28:22 PM

"Slava " <sla...@gmail.com> wrote in message <hf3qel$k7b$1...@fred.mathworks.com>...

Dear Slava,
Ronald W. Schafer and Lawrence R. Rabiner, "Digital Representations of Speech Signals," Proceedings of the IEEE, vol. 63, no. 4, April 1975.
1) The major significance of E(n) is that it provides a good measure for separating voiced speech segments from unvoiced speech segments. E(n) for unvoiced segments is much smaller than for voiced segments.
2) It is well known that the energy of voiced speech tends to be concentrated below 3 kHz, whereas the energy of fricatives generally is concentrated above 3 kHz. Thus, zero crossing measurements (along with energy information) are often used in making a decision about whether a particular segment of speech is voiced or unvoiced. If the zero crossing rate is high, the implication is unvoiced; if the zero crossing rate is low, the segment is most likely to be voiced.
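
As a rough illustration of these two measures (a sketch, not code from the paper), the per-frame energy and zero-crossing rate could be computed as below. The synthetic test signal, the 400-sample frame length, and all variable names are assumptions made for the example; replace `x` and `fs` with your own recording.

```matlab
% Synthetic stand-in for a recording (assumption, not real speech):
% a strong 150 Hz tone plays the "voiced" part, a weak 3500 Hz tone the "unvoiced" part.
fs = 8000;                                   % sampling rate in Hz
n  = (0:3599)';                              % sample indices, 3600 samples in total
x  = [sin(2*pi*150*n(1:2000)/fs); 0.1*sin(2*pi*3500*n(2001:3600)/fs)];

frameLen  = 400;                             % samples per frame (50 ms at 8 kHz)
numFrames = floor(length(x) / frameLen);     % only full frames are used
frames    = reshape(x(1:numFrames*frameLen), frameLen, numFrames);  % one frame per column

% Short-time energy: sum of squared samples in each frame.
E = sum(frames.^2, 1);

% Zero-crossing rate: fraction of adjacent sample pairs whose signs differ.
signs = sign(frames);
signs(signs == 0) = 1;                       % treat exact zeros as positive
ZCR = sum(abs(diff(signs, 1, 1)) > 0, 1) / (frameLen - 1);

% One row per frame: [frame number, energy, zero-crossing rate].
featureTable = [(1:numFrames)', E', ZCR'];
disp(featureTable);
```

For this test signal the first five frames show high energy with a low zero-crossing rate, and the last four show the opposite pattern, which is the contrast the two points above describe.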

Yogesh S Angal

Slava

Dec 18, 2009, 3:04:04 AM

Thank you for your answer, Yogesh.
But how can I perform this separation in MATLAB?
I know how to record a signal and how to compute its average energy and average zero-crossing rate.
After that I have two vectors:
1 = average energy of the recorded signal.
2 = average zero-crossing rate of the recorded signal.
What steps do I need to take to separate (and plot, of course) my recorded voice file into "voiced" and "unvoiced" parts?
Do I need to compare these vectors somehow? Divide them?

Thank you.

yogesh angal

Dec 18, 2009, 2:48:04 PM

"Slava " <sla...@gmail.com> wrote in message <hgfd1k$j86$1...@fred.mathworks.com>...

Dear Slava,
In the frame-by-frame processing stage, the speech signal is segmented into non-overlapping frames of samples and processed frame by frame until the entire signal is covered. Prepare a table that holds the energy and the zero-crossing rate of each frame, then make the voiced/unvoiced decision for each frame from those values. For example, suppose the signal has 3600 samples at an 8000 Hz sampling rate. At the beginning, we set the frame size to 400 samples, which gives 9 frames. If the decision for a frame is not clear, split that frame into two halves and recalculate the energy and zero-crossing rate for each half.
This will give the correct information about the voiced and unvoiced parts of the signal.
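
A minimal sketch of this procedure follows, under several assumptions: the synthetic test signal, the two thresholds `eThresh` and `zThresh`, the use of mean energy per sample (so that frames of 400 and 200 samples can share one threshold), and all variable names are illustrative only. The thresholds in particular would need tuning on real recordings.

```matlab
% Synthetic stand-in for a recording (assumption, not real speech):
% strong 150 Hz tone = "voiced", weak 3500 Hz tone = "unvoiced".
% The change at sample 1801 falls in the middle of frame 5 on purpose.
fs = 8000;
n  = (0:3599)';
x  = [sin(2*pi*150*n(1:1800)/fs); 0.1*sin(2*pi*3500*n(1801:3600)/fs)];

frameLen = 400;      % initial frame size in samples
eThresh  = 0.05;     % assumed threshold on mean energy per sample
zThresh  = 0.25;     % assumed threshold on zero-crossing rate

labels    = zeros(size(x));   % per sample: +1 voiced, -1 unvoiced, 0 undecided
numFrames = floor(length(x) / frameLen);

for k = 1:numFrames
    first = (k-1)*frameLen + 1;
    segs  = [first, first + frameLen - 1];   % pass 1 tests the whole frame
    for pass = 1:2                           % pass 2 tests the two halves, if needed
        undecided = zeros(0, 2);
        for s = 1:size(segs, 1)
            idx = segs(s,1):segs(s,2);
            seg = x(idx);
            E   = mean(seg.^2);                              % mean energy per sample
            sg  = sign(seg);
            sg(sg == 0) = 1;                                 % treat exact zeros as positive
            Z   = sum(sg(1:end-1) ~= sg(2:end)) / (numel(seg) - 1);
            if E > eThresh && Z < zThresh
                labels(idx) = 1;             % high energy, few crossings: voiced
            elseif E <= eThresh && Z >= zThresh
                labels(idx) = -1;            % low energy, many crossings: unvoiced
            else
                % Decision not clear: queue the two halves for the next pass.
                mid = segs(s,1) + floor(numel(idx)/2);
                undecided = [undecided; segs(s,1), mid-1; mid, segs(s,2)];
            end
        end
        segs = undecided;
    end
end

voicedMask   = (labels == 1);
unvoicedMask = (labels == -1);
```

To show the result, plot `x` against time and overlay the samples selected by `voicedMask` and `unvoicedMask` in different colors. Segments that are still unclear after one split keep the label 0; with this rule that includes silence, which has both low energy and few zero crossings.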

Ask me if you have any problems.
Yogesh

Ramdas Dongare

Feb 20, 2010, 2:38:04 AM

"yogesh angal" <yoges...@yahoo.co.in> wrote in message <hggm9k$8a1$1...@fred.mathworks.com>...
Dear sir,

Can we do endpoint detection using only the energy of each frame?

akshay1...@gmail.com

Nov 26, 2012, 11:07:41 PM

to Slava

Hey! I am working on the same project and just saw your post. Has your problem been resolved? If so, please mail me the solution for separating the voiced and unvoiced parts.

Syam Prasad Boddu

Feb 19, 2014, 5:57:09 AM

"Ramdas Dongare" <ramd...@gmail.com> wrote in message <hlo3gs$r2v$1...@fred.mathworks.com>...

Hi sir,
I am also doing the same project, but I can't understand how the speech signal gets separated into voiced and unvoiced speech using energy and zero-crossing rate in MATLAB. Please help me as soon as possible and provide a MATLAB program.

Thanks in advance