Pitch shifting

An adventure story about writing a pitch shifter, with incorrect but interesting ideas.

Naive approach

Task (almost)
All frequency components \(f\) of a signal need to be moved to a different frequency \(f' = f \cdot K\) for a constant ratio \(K\).

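As with the equalizers, this screams to be solved via FFT: transform the signal, move the component at array index \(i\) to index \(i \cdot K\), and transform back.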

Mild obstacles

First, we notice that most new array locations \(i \cdot K\) don't fall on integer indices. When transforming a whole 5 min song in one go, the frequency bins are so closely spaced that we can simply round up or down without any audible problems. Multiple components mapped to the same location can be summed up.
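A minimal sketch of this rounding step, assuming the array holds only the non-negative-frequency bins and that components pushed past the end are dropped. The function name `shift_bins_rounded` is made up for illustration.

```python
import math


def shift_bins_rounded(spectrum, K):
    """Move bin i to the nearest integer to i * K; collisions are summed.

    Assumption: `spectrum` holds only non-negative-frequency bins
    (index 0 = DC). Components that land past the end are dropped.
    """
    out = [0j] * len(spectrum)
    for i, z in enumerate(spectrum):
        j = math.floor(i * K + 0.5)  # round half up, avoids banker's rounding
        if 0 <= j < len(out):
            out[j] += z
    return out


# Shifting down an octave (K = 0.5): bins 3 and 4 both land on index 2.
print(shift_bins_rounded([0, 0, 0, 1, 2, 0, 0, 0], 0.5))
```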

Things get complicated when the signal comes in small chunks (\(\le\) 1024 samples). The frequency bins are now far apart, and suddenly rounding introduces audible detuning and harmonic changes.
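To put a rough number on this, here is a small calculation of the worst-case detuning when a component lands half a bin away from its target. The 44.1 kHz sample rate and the 440 Hz test tone are assumptions for illustration; the text fixes neither.

```python
import math

SAMPLE_RATE = 44100  # Hz; assumed, the text does not fix a sample rate
CHUNK = 1024         # samples per chunk, as in the text


def worst_case_detune_cents(freq_hz, sample_rate=SAMPLE_RATE, n=CHUNK):
    """Detuning in cents when a component lands half a bin above target."""
    bin_width = sample_rate / n
    return 1200 * math.log2((freq_hz + bin_width / 2) / freq_hz)


print(SAMPLE_RATE / CHUNK)                               # bin spacing in Hz
print(worst_case_detune_cents(440))                      # 1024-sample chunk
print(worst_case_detune_cents(440, n=5 * 60 * 44100))    # whole 5 min song
```

Under these assumptions the bins of a 1024-sample chunk are about 43 Hz apart, so a 440 Hz component can end up more than half a semitone off, while for the whole song the error is a tiny fraction of a cent.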

Simple fix

If the new index \(i \cdot K\) falls between two integers \(j\) and \(j+1\), add the component to both array locations, with weights \((\frac{1}{2}, \frac{1}{2})\) at the midpoint, smoothly going to \((1,0)\) as we approach an integer index.
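One way to read this fix is as linear interpolation. The sketch below assumes the weights are \((1-t, t)\), where \(t\) is the fractional part of \(i \cdot K\); the text only pins down the midpoint and the integer case, so the linear ramp in between is an assumption, as is the name `shift_bins_linear`.

```python
import math


def shift_bins_linear(spectrum, K):
    """Split each component between the two bins around i * K.

    Assumption: weights are (1 - t, t) with t the fractional part of
    i * K, giving (1/2, 1/2) at the midpoint and (1, 0) on an integer.
    `spectrum` holds non-negative-frequency bins only, and K >= 0.
    """
    out = [0j] * len(spectrum)
    for i, z in enumerate(spectrum):
        pos = i * K
        j = math.floor(pos)
        t = pos - j
        if 0 <= j < len(out):
            out[j] += (1 - t) * z
        if t > 0 and j + 1 < len(out):
            out[j + 1] += t * z
    return out


# Bin 3 shifted by K = 1.5 lands at 4.5 and is split evenly over bins 4 and 5.
print(shift_bins_linear([0, 0, 0, 1, 0, 0, 0, 0], 1.5))
```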

This is still bad! Now a single sine wave input creates dissonant output, because it is spread over two close frequencies.
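A short worked equation shows what happens in the midpoint case. Assume a chunk of \(N\) samples (sample index \(n\)) and equal weights on the neighbouring bins \(j\) and \(j+1\); phases are ignored for simplicity:

\[
\sin\!\left(\frac{2\pi j n}{N}\right) + \sin\!\left(\frac{2\pi (j+1) n}{N}\right)
= 2 \cos\!\left(\frac{\pi n}{N}\right) \sin\!\left(\frac{2\pi (j+\frac{1}{2}) n}{N}\right)
\]

The sine factor sits at the intended frequency \(j+\frac{1}{2}\), but the cosine factor is a beating envelope that drops to zero at \(n = N/2\), so under these assumptions the tone fades out in the middle of every chunk instead of staying steady.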

Fancier version, interesting failure

Actually, what is the perfect output of a sine wave input?

Let's say the input frequency is one of the FFT's frequencies. Then after the FFT, we get an array \((\dots,0,0,z,0,0,\dots)\) with a single entry.

Of course, the output should again be a sine wave, but one that does not hit an FFT frequency, so the FFT'ed output will be a very messy array. With some patience, we can compute the components of this array: they follow a sinc function, with its peak at the location \(i \cdot K\) and a decaying mess everywhere else.
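The shape of that messy array can be checked numerically. The sketch below uses a slow textbook DFT on a small complex tone and compares the magnitudes against the closed form \(|\sin(\pi d) / \sin(\pi d / N)|\) with \(d = i \cdot K - k\). For a finite chunk this periodic variant of the sinc (the Dirichlet kernel) is the exact shape, and it is close to an ordinary sinc near the peak. The chunk size 16 and the helper names are chosen purely for illustration.

```python
import cmath
import math


def naive_dft(x):
    """O(N^2) DFT, fine for a small illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def off_bin_magnitude(p, k, n):
    """|DFT bin k| of exp(2*pi*i*p*t/n), t = 0..n-1, for fractional p."""
    d = p - k
    denom = math.sin(math.pi * d / n)
    if abs(denom) < 1e-12:  # p sits exactly on bin k (mod n)
        return float(n)
    return abs(math.sin(math.pi * d) / denom)


N = 16
p = 4.5  # e.g. bin i = 3 shifted by K = 1.5
tone = [cmath.exp(2j * math.pi * p * t / N) for t in range(N)]
for k, z in enumerate(naive_dft(tone)):
    # columns: bin index, measured magnitude, closed-form magnitude
    print(k, round(abs(z), 3), round(off_bin_magnitude(p, k, N), 3))
```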

PICTURE!

So, to create a pitch shifter, we need to take each component in the FFT array, use it to scale a sinc function centred at \(i \cdot K\), sum it all up (and FFT back). This could be done with a matrix. An approximation would just interpolate between more neighbours around the index \(i \cdot K\).
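As a rough sketch of the matrix idea, the code below builds a matrix whose column \(i\) holds the spectrum of a unit tone at the fractional bin \(i \cdot K\), then applies it to an FFT array. Several things here are assumptions made for illustration: the input is treated as a complex signal, only bins below \(N/2\) are handled, targets at or beyond \(N/2\) are dropped, and the names `kernel`, `shift_matrix` and `apply_matrix` are invented. It shows the construction only and makes no claim about how the result sounds.

```python
import cmath
import math


def kernel(d, n):
    """Sum over t = 0..n-1 of exp(2*pi*i*d*t/n), in closed form."""
    r = cmath.exp(2j * math.pi * d / n)
    if abs(1 - r) < 1e-12:  # d is a multiple of n: every term equals 1
        return complex(n)
    return (1 - cmath.exp(2j * math.pi * d)) / (1 - r)


def shift_matrix(n, K):
    """M[k][i]: contribution of input bin i to output bin k.

    Assumptions: only input bins i < n / 2 are handled, and targets
    i * K at or beyond n / 2 are dropped.
    """
    rows = [[0j] * n for _ in range(n)]
    for i in range(n // 2):
        p = i * K
        if p >= n / 2:
            continue
        for k in range(n):
            rows[k][i] = kernel(p - k, n) / n
    return rows


def apply_matrix(matrix, spectrum):
    """Plain matrix-vector product."""
    return [sum(m * z for m, z in zip(row, spectrum)) for row in matrix]


N = 16
X = [0j] * N
X[3] = N  # FFT array of a unit complex tone sitting exactly on bin 3
Y = apply_matrix(shift_matrix(N, 1.5), X)
print([round(abs(z), 3) for z in Y])  # energy smeared around index 4.5
```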

But...
