# [Translation] Definitions of framedelta~ and frameaccum~ as used in FFT patches (reposted from the old forum)


### Chien-Wen Cheng

Jun 19, 2009, 9:24:57 PM
to MAX/MSP/Jitter Interactive Music and Interactive Art Forum
CWCheng

frameaccum~ definition

--------------------------------------------------------------------------------

framedelta~: computes a running phase deviation by subtracting, position by position, the previously received signal vector from the current signal vector.

The framedelta~ object computes a running phase deviation by
subtracting values in each position of its previously received signal
vector from the current signal vector. In other words, for each signal
vector, the first sample of its output will be the first sample in the
current signal vector minus the first sample in the previous signal
vector, the second sample of its output will be the second sample in
the current signal vector minus the second sample in the previous
signal vector, and so on. When used inside a pfft~ object, it keeps a
running phase deviation of the FFT because the FFT size is equal to
the signal vector size.
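
As a rough illustration of the subtraction described above, here is a minimal Python sketch. It assumes a signal vector can be modelled as a plain list of floats and that the "previous" vector is all zeros before any input arrives; the class name `FrameDelta` and its `process` method are made up for this example and are not part of Max/MSP.

```python
class FrameDelta:
    """Illustrative model of framedelta~: current vector minus previous vector."""

    def __init__(self, vector_size):
        # Assumption: before any input arrives, the "previous" vector is all zeros.
        self.previous = [0.0] * vector_size

    def process(self, vector):
        # Subtract position by position, then remember the current vector
        # so it becomes the "previous" one on the next call.
        out = [cur - prev for cur, prev in zip(vector, self.previous)]
        self.previous = list(vector)
        return out


fd = FrameDelta(4)
first = fd.process([1.0, 2.0, 3.0, 4.0])
second = fd.process([1.5, 2.0, 2.0, 8.0])
```

In this sketch `first` equals the input itself (nothing has been received yet), and `second` holds the per-position change between the two vectors.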

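frameaccum~: computes a running phase sum, i.e. the sum of the values at each position of its incoming signal vectors. In other words, for each signal vector, the first output sample is the sum of the first samples of all the signal vectors received so far.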

The frameaccum~ object computes a running phase by keeping a sum of
the values in each position of its incoming signal vectors. In other
words, for each signal vector, the first sample of its output will be
the sum
of all of the first samples in each signal vector it has received, the
second sample of its output will be the sum of all the second samples
in each signal vector, and so on. When used inside a pfft~ object, it
can keep a running phase of the FFT because the FFT size is equal to
the signal vector size.
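
The running sum described above can be sketched in the same way. This is only an illustration under the same assumptions (a signal vector is a list of floats, the sum starts at zero); `FrameAccum` and `process` are hypothetical names, not Max/MSP API.

```python
class FrameAccum:
    """Illustrative model of frameaccum~: per-position running sum of all vectors."""

    def __init__(self, vector_size):
        # Assumption: the running sum starts at zero in every position.
        self.total = [0.0] * vector_size

    def process(self, vector):
        # Add the incoming vector to the running sum, position by position.
        self.total = [acc + x for acc, x in zip(self.total, vector)]
        return list(self.total)


fa = FrameAccum(3)
outputs = [
    fa.process(v)
    for v in ([1.0, 2.0, 3.0], [0.5, 0.0, -1.0], [0.5, 1.0, 1.0])
]
```

Each entry of `outputs` is the sum of every vector received up to that point, so feeding it the output of a frame-difference stage rebuilds the original values.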

CWCheng

--------------------------------------------------------------------------------

The phase vocoder technique can stretch or shorten a sound without changing its pitch.

I agree that the phase wrapping is not necessary in ALL cases in a
spectral delay (I haven't looked at the example in question though, so
read on to find out when it is). If you leave it out you should also
not use polar co-ordinates at all, because the trig needed for the
conversion is really expensive. In all my spectral work I try to avoid
using polar values for this reason. So far it has always been possible.

So, I said not in all cases - the reason for this is that for a fixed
delay this works correctly (and arguably more accurately), but not for
variable delays (where NOT accumulating will not sound smooth because
the phases will not be read from consecutive frames in the buffer as
the delay changes, so the resultant phase differences will not make
sense). In this case it is technically possible to phase accumulate
using cartesian geometry (using complex multiplies and divides), which
is cheaper. This is hard to do in MSP code, however. I have made an
external that does this (actually for my spectral delay - so I have
tried all these different options in practice) - I may post it to the
share site soon if I can find time to neaten it up and port to UB.

To summarise - if you want a fixed delay lose the trig and forget
accumulating. If it needs to vary over time then the accumulation will
sound much smoother whilst the delay is changing.
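
One possible reading of "phase accumulate using cartesian geometry" is sketched below in Python. It is not the external mentioned in the post; the function names `unit` and `accumulate_cartesian`, the single-bin framing, and the choice of a zero-phase starting state are all assumptions made for illustration. The idea is to keep the accumulated phase as a unit-length complex number and rotate it by the normalised product of the current frame and the conjugate of the previous one, so the loop needs no atan2, sin or cos.

```python
import cmath


def unit(z):
    """Scale a complex number to length 1 (assumption: zero maps to 1+0j)."""
    magnitude = abs(z)
    return z / magnitude if magnitude > 0.0 else 1 + 0j


def accumulate_cartesian(frames):
    """Phase-accumulate one bin's complex values in the order they are read.

    The rotation between consecutive frames is cur * conj(prev), normalised
    to unit length; the running phase is a unit phasor updated by a complex
    multiply. Only multiplies, a divide and a square root (inside abs) are used.
    """
    acc = 1 + 0j    # running phase as a unit phasor (assumed to start at phase 0)
    prev = 1 + 0j   # assumed "previous frame" before any input, phase 0
    out = []
    for cur in frames:
        rotation = unit(cur * prev.conjugate())
        acc = unit(acc * rotation)  # renormalise to limit rounding drift
        out.append(abs(cur) * acc)
        prev = cur
    return out


# cmath.rect is used here only to build test data from (magnitude, phase) pairs.
frames = [cmath.rect(1.0, 0.3), cmath.rect(2.0, 1.1), cmath.rect(0.5, -2.0)]
rebuilt = accumulate_cartesian(frames)
```

When the frames are read consecutively, as here, the accumulated output matches the input; the accumulation only changes the result when frames are read from non-consecutive positions, which is the variable-delay case discussed above.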
_________________
Chien-Wen Cheng's Music:

http://w3.nctu.edu.tw/~u8642524/index.htm

timbre

--------------------------------------------------------------------------------

maybe it helps, if you think about a simple sinusoidal input and
imagine in what way each consecutive input frame differs from the
following (or the past).
you can take a look at the attached file. this is only a quick hack,
and the display is aliasing and might not be very accurate, but
hopefully it gives you the right idea of what is happening.

dividing the sampling rate by the fft size, you get the fft delta-bin
frequency, i.e. the frequency spacing of the fft bins.
if the input signal is exactly a multiple of the delta-bin freq, i.e.
if the input freq is at the center of an fft bin, a whole number of
periods fits exactly into one fft frame. that means every fft frame
looks the same, the phase is not moving, so the phase difference is
0.
but as soon as the input signal is not at the center of an fft bin,
the input signal does not fit into the fft frame an integer number of
times. so every input frame 'looks different', the phase is moving.
if the input frequency stays constant, the phase difference will also
be constant - but not 0!
in order to resynthesize the correct frequency, you'd have to account
for this moving phase, i.e. you'd have to add a constant phase-offset
for every fft frame -> accumulate the phase deltas...
don't know if that clears it up, but take a look at the attachment.

in fact this is an attempt of a simplified explanation of why you
have to deal with phase differences and not with actual phase values.
in reality it is a little more complex as usual...
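
The argument above can be checked numerically with a small Python sketch. It is independent of the attached patch and rests on stated assumptions: a 44100 Hz sampling rate, a 256-point frame, non-overlapping frames, no window, and a cosine input. The helper names (`bin_phase`, `wrap`, `phase_deltas`) are invented for this example, and a single DFT bin is computed by direct summation rather than a full FFT.

```python
import cmath
import math

SAMPLE_RATE = 44100.0               # assumed sampling rate
FFT_SIZE = 256                      # assumed fft size
DELTA_BIN = SAMPLE_RATE / FFT_SIZE  # frequency spacing of the fft bins


def bin_phase(freq, bin_index, frame_index):
    """Phase of one DFT bin for a cosine input, frames taken without overlap."""
    start = frame_index * FFT_SIZE
    acc = 0j
    for n in range(FFT_SIZE):
        sample = math.cos(2.0 * math.pi * freq * (start + n) / SAMPLE_RATE)
        acc += sample * cmath.exp(-2j * math.pi * bin_index * n / FFT_SIZE)
    return cmath.phase(acc)


def wrap(phase):
    """Wrap a phase value into the range [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def phase_deltas(freq, bin_index, frames=5):
    """Wrapped phase difference between each pair of consecutive frames."""
    phases = [bin_phase(freq, bin_index, i) for i in range(frames)]
    return [wrap(b - a) for a, b in zip(phases, phases[1:])]


centered = phase_deltas(10 * DELTA_BIN, 10)       # exactly on bin 10
off_center = phase_deltas(10.25 * DELTA_BIN, 10)  # a quarter bin above bin 10
```

For the tone at the bin center the deltas come out essentially zero. For the tone a quarter of a bin higher they sit close to a quarter of a cycle (about pi/2) per frame, with a small ripple caused by leakage from the negative-frequency component.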

#P window setfont "Sans Serif" 9.;
#P window linecount 1;
#P hidden newex 337 69 40 196617 t 5 0 b;
#P hidden newex 337 44 48 196617 loadbang;
#P window linecount 3;
#P comment 9 250 50 196617 calculate phase deltas;
#P window linecount 1;
#P comment 25 534 100 196617 phase delta;
#P newex 56 269 64 196617 phasewrap~;
#P comment 19 388 48 196617 + π --;
#P comment 19 517 48 196617 - π --;
#P flonum 56 347 60 9 0 0 0 3 0 0 0 221 221 221 222 222 222 0 0 0;
#P newex 56 298 39 196617 sah~;
#P newex 56 320 70 196617 snapshot~ 11;
#P user multiSlider 56 390 12 135 -3.2 3.2 1 2681 15 0 0 2 0 0 0;
#M frgb 0 0 0;
#M brgb 255 255 255;
#M rgb2 127 127 127;
#M rgb3 0 0 0;
#M rgb4 37 52 91;
#M rgb5 74 105 182;
#M rgb6 112 158 18;
#M rgb7 149 211 110;
#M rgb8 187 9 201;
#M rgb9 224 62 37;
#M rgb10 7 114 128;
#P comment 18 453 48 196617 -- 0 --;
#P newex 56 250 65 196617 framedelta~;
#P comment 103 387 48 196617 + π --;
#P comment 103 516 48 196617 - π --;
#P flonum 140 347 60 9 0 0 0 3 0 0 0 221 221 221 222 222 222 0 0 0;
#P flonum 224 129 45 9 0 0 0 3 0 0 0 255 227 23 222 222 222 0 0 0;
#N vpatcher 20 74 350 350;
#P window setfont "Sans Serif" 9.;
#P number 130 138 35 9 0 0 0 3 0 0 0 221 221 221 222 222 222 0 0 0;
#P newex 50 96 27 196617 i;
#P newex 50 74 34 196617 + 0.5;
#P inlet 50 30 15 0;
#P outlet 50 148 16 0;
#P connect 1 0 2 0;
#P connect 2 0 3 0;
#P connect 3 0 0 0;
#P connect 3 0 4 0;
#P pop;
#P newobj 186 226 43 196617 p round;
#P newex 169 249 27 196617 ==~;
#P newex 140 298 39 196617 sah~;
#P newex 140 320 70 196617 snapshot~ 11;
#P user multiSlider 140 389 12 135 -3.2 3.2 1 2681 15 0 0 2 0 0 0;
#M frgb 0 0 0;
#M brgb 255 255 255;
#M rgb2 127 127 127;
#M rgb3 0 0 0;
#M rgb4 37 52 91;
#M rgb5 74 105 182;
#M rgb6 112 158 18;
#M rgb7 149 211 110;
#M rgb8 187 9 201;
#M rgb9 224 62 37;
#M rgb10 7 114 128;
#P newex 255 228 45 196617 poke~ x;
#N vpatcher 20 74 409 376;
#P window setfont "Sans Serif" 9.;
#P newex 87 68 54 196617 dspstate~;
#P comment 162 129 100 196617 fft-size;
#P comment 162 103 100 196617 sampling rate;
#P flonum 101 154 56 9 0 0 0 3 0 0 0 221 221 221 222 222 222 0 0 0;
#P newex 101 126 40 196617 / 256.;
#P flonum 101 101 56 9 0 0 0 3 0 0 0 221 221 221 222 222 222 0 0 0;
#P comment 162 156 100 196617 delta-bin frequency;
#P outlet 101 190 15 0;
#P connect 7 1 2 0;
#P connect 2 0 3 0;
#P connect 3 0 4 0;
#P connect 4 0 0 0;
#P pop;
#P newobj 159 75 80 196617 p delta-bin freq;
#P newex 548 435 70 196617 buffer~ x 5.8;
#P message 548 387 32 196617 set x;
#P user waveform~ 186 387 363 137 3 9;
#W mode select;
#W mouseoutput none;
#W clipdraw 1;
#W unit samples;
#W grid 22.675737;
#W ticks 0;
#W labels 1;
#W vlabels 0;
#W vticks 0;
#W bpm 120. 4.;
#W frgb 33 0 0;
#W brgb 60 178 173;
#W rgb2 0 95 255;
#W rgb3 0 0 0;
#W rgb4 0 0 0;
#W rgb5 146 179 217;
#W rgb6 100 100 100;
#W rgb7 100 100 100;
#P user ezdac~ 609 89 653 122 0;
#P newex 97 214 53 196617 cartopol~;
#P flonum 97 56 45 9 0 0 0 3 0 0 0 40 204 140 222 222 222 0 0 0;
#P newex 97 103 72 196617 * 1.;
#P flonum 97 128 56 9 0 0 0 3 0 0 0 221 221 221 222 222 222 0 0 0;
#P newex 97 155 40 196617 cycle~;
#P newex 97 186 82 196617 fft~ 256 256 0;
#P comment 95 42 100 196617 bin number;
#P comment 156 129 28 196617 Hz;
#P comment 102 452 48 196617 -- 0 --;
#P comment 272 131 100 196617 phase ( 0. - 1. );
#P comment 302 230 137 196617 record time domain input;
#P comment 99 534 100 196617 actual phase value;
#P window linecount 2;
#P comment 47 207 50 196617 calculate phase;
#P window linecount 1;
#P comment 184 300 143 196617 s&h fft bin of interest;
#P comment 301 369 182 196617 time-domain input signal;
#P window linecount 2;
#P comment 239 531 239 196617 if the instantaneous phase of the input
frame is 1. (cosine) the fft will calculate a phase value of 0.;
#P window linecount 1;
#P comment 507 598 177 196617 volker böhm vbo...@gmx.ch;
#P fasten 16 1 32 0 145 240 61 240;
#P connect 32 0 40 0;
#P connect 40 0 36 0;
#P connect 36 0 35 0;
#P connect 35 0 37 0;
#P connect 37 0 34 0;
#P fasten 26 0 36 1 174 291 90 291;
#P hidden connect 44 0 15 0;
#P connect 15 0 14 0;
#P connect 14 0 13 0;
#P connect 13 0 12 0;
#P connect 12 0 11 0;
#P connect 11 0 16 0;
#P fasten 28 0 12 1 229 151 132 151;
#P connect 11 1 16 1;
#P connect 16 1 25 0;
#P connect 25 0 24 0;
#P connect 24 0 29 0;
#P connect 29 0 23 0;
#P connect 21 0 14 1;
#P connect 11 2 26 0;
#P connect 26 0 25 1;
#P fasten 15 0 27 0 102 98 191 98;
#P connect 27 0 26 1;
#P hidden connect 19 0 18 0;
#P hidden connect 44 1 28 0;
#P fasten 12 0 22 0 102 177 260 177;
#P fasten 11 2 22 1 174 207 277 207;
#P hidden connect 43 0 44 0;
#P hidden connect 44 2 19 0;
#P window clipboard copycount 45;

timbre

--------------------------------------------------------------------------------

frameaccum~ computes 'running phase' by adding an fft frame to the
previous frame as a vector (i.e. it adds the first sample of the first
frame to the first sample of the last frame, the second sample of the
first to the second sample of the last, etc.)... you can conceptualize
frameaccum~, framedelta~, and vectral~ as signal-rate equivalents of
doing math with vexpr. it puts out a vector with all of these sums in
the same order they originally came in. you need frameaccum~ when
you're building phase vocoders to prevent glitches when you read fft
analysis frames out of order (that's why in the phase vocoder examples
you record the difference between phases rather than the phases
themselves into the buffer~ objects). best. /luke
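
The "store differences, then accumulate" idea can be sketched for a single bin in Python. This is an illustration only: `wrap`, `encode` and `decode` are hypothetical helpers loosely standing in for phasewrap~, framedelta~ and frameaccum~, the starting phase is assumed to be zero, and repeating each stored delta twice is just a crude stand-in for reading analysis frames at half speed.

```python
import math


def wrap(phase):
    """Wrap a phase value into the range [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def encode(phases):
    """Store wrapped differences between consecutive phases of one bin."""
    prev = 0.0  # assumption: the phase before the first frame is zero
    deltas = []
    for p in phases:
        deltas.append(wrap(p - prev))
        prev = p
    return deltas


def decode(deltas):
    """Rebuild phases as a running sum of deltas, wrapped for comparison."""
    total = 0.0
    out = []
    for d in deltas:
        total += d
        out.append(wrap(total))
    return out


phases = [0.5, 2.0, -2.8, -1.0, 0.7]
deltas = encode(phases)
restored = decode(deltas)

# Reading each stored delta twice: every output step is still one of the
# analysed per-frame phase advances, so the phase keeps moving coherently.
stretched = decode([d for d in deltas for _ in range(2)])
```

Read in order, the accumulated deltas give back the original phases. Read in any other order, each output frame still advances by a phase step that was actually measured, which is the reason given above for recording differences rather than raw phases.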

lmj0316

--------------------------------------------------------------------------------

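+pi. When "decoding", frameaccum~ then adds up all the delta values from the vectors up to the current one to recover the phase. Both are applied together in this example, precisely because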

timbre

--------------------------------------------------------------------------------

lmj0316 wrote:

FFT is most commonly used for two functions, time-stretch (without changing pitch) and transpose (without changing duration), though, much like granular