Discrete Fourier Transform (eeks)…

So i was having an hour or so today available to waste on the audio thing i discussed earlier. I implemented a simple WAV loader that gives me 16-bit mono or stereo PCM data. For now i stick to mono as it makes my life a bit easier on the Fourier side of things. The goal is to extract 1024 samples from the mono 16-bit PCM stream and do an FFT on those samples, process them and then move on to the next 1024.

What i get out of the FFT is 1024 coefficients, 512 for the cosinus sinoids and 512 for the sine sinoids that can recover the original PCM signal when using them with an inverse discrete Fourier transform. The 512 coefficients are each for a specific frequency in the range 0-22khz (half the sampling rate of 44100), with the first coefficient being the average over all the others, so this one can be left out (as far as i understood it). Coefficient 1 (indexing starts at 0) stands for frequency 22050 / 512 = 43,06Hz, coefficient 2 stands for frequency ~86Hz and so on. So far so good.

Now i want to calculate the amplitude of each frequency bin (cosine and sine combined) which is given by

amp[i] = Math.sqrt(re[i] * re[i] + im[i]*im[i]);

where i is the index of a frequency, re[i] stands for the ith cosine coefficient and im[i] stands for the ith sine coefficient (as the FFT produces complex numbers the coefficient arrays or often refered to by those names in the literature, don’t get confused).

Now my problem is that i have no idea on what scale those amplitudes are. It’s probably irrelevant for what i have in mind (creating sub bands by concatenating and averaging the amplitudes for beat detection as described here), but it would still be nice to know what exactly i can expect. From what i can tell it is unimportant wheter the original sample data is 16-bit PCM or 32-bit float in the range [-1,1], the coefficients will scale. I’m a little bit uneasy about the Math.sqrt() call i need to get the proper euclidian amplitude (I’m pretty sure it’s not called euclidian :) ), that sucker will cost me quiet some processing power on a mobile device. There are fixed point FFTs around but i don’t want to open that can of worms either.

Anyways, i succeeded in applying the FFT to streamed audio and the results which i visualized seem to be correct when compared to what audacity’s spectral analyzer shows me. Next time i should already have the sub band thing going. I also have to come up with a way to synchronize the FFT calculation with actual audio playback for testing purposes as it will easily allow me to see where in the spectrum i should focus on to detect beats.

If you don’t want to implement the FFT yourself you can have a look at JTransforms which is a Java FFT library that performs reasonably well.

If any of you readers out there have experience with audio signal processing drop me a line, maybe you can give me some hints on the questions i have.

  • Share/Bookmark
 

Leave a Reply