I was playing Audiosurf today after seeing that it hasn’t been ported to the iPhone yet. The reason? Apple does not expose direct access to the music files in its latest SDK. On Android the situation is quite similar. You have access to the SD card and to any mp3 files that can be found there, but you don’t have access to the decoders for mp3/ogg/what have you. There is no simple way to get PCM data unless you port libmad and a similar decoder for Ogg Vorbis to Android. Granted, that might not be a lot of work, but it would be nice to have that functionality out of the box. Feeding PCM to the audio processor is at least easy with the AudioTrack class.
I wasn’t all that impressed with Audiosurf’s way of visualizing Schism by Tool; maybe it was lousy because I only tried the demo version. I became interested in how the tracks might get constructed in Audiosurf and poked around the intrawebs a bit. One part of the equation is beat detection or tracking, which I researched a bit today. There’s not a lot of information on the web about this topic that is useful to a non-mathematically inclined programmer like me. I know my way around simple DSP topics, but the Fourier Transform (FT) is about as far as my understanding goes.
One article that comes up a lot is the one written by Frederic Patin over at gamedev.net. It describes a couple of algorithms to detect beats. The simplest of the algorithms calculates the “instant” energy of n successive samples (let’s call those a frame) and compares it to the average energy of a second’s worth of frames before that frame. If the energy of the current frame exceeds this average times some constant, a beat has been found.
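A minimal sketch of this simplest algorithm in plain Java might look like the following. The class and method names are made up for illustration, and the frame size, history length and threshold are parameters to tune rather than values confirmed by the article.

```java
import java.util.ArrayList;
import java.util.List;

class EnergyBeatDetector {
    /** "Instant" energy of one frame: the sum of its squared samples. */
    static double frameEnergy(float[] samples, int start, int frameSize) {
        double energy = 0;
        for (int i = start; i < start + frameSize; i++) {
            double s = samples[i];
            energy += s * s;
        }
        return energy;
    }

    /**
     * Returns the indices of all frames whose energy exceeds `threshold`
     * times the average energy of the `historySize` frames before them.
     * The first `historySize` frames are skipped because they have no
     * full history to compare against.
     */
    static List<Integer> detectBeats(float[] samples, int frameSize,
                                     int historySize, double threshold) {
        int frameCount = samples.length / frameSize;
        double[] energies = new double[frameCount];
        for (int f = 0; f < frameCount; f++) {
            energies[f] = frameEnergy(samples, f * frameSize, frameSize);
        }

        List<Integer> beats = new ArrayList<>();
        for (int f = historySize; f < frameCount; f++) {
            double average = 0;
            for (int h = f - historySize; h < f; h++) {
                average += energies[h];
            }
            average /= historySize;
            if (energies[f] > threshold * average) {
                beats.add(f);
            }
        }
        return beats;
    }
}
```

For 44100 Hz audio, a call such as `EnergyBeatDetector.detectBeats(samples, 1024, 43, 1.3)` compares each frame against roughly one second of history (43 × 1024 = 44032 samples). Those numbers are only a plausible starting point.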
The problem with this approach is that it takes the whole frequency spectrum into account at once, so you get the combined energy of, say, your bass, guitar, drums and vocals. This simple algorithm might work for electronic music where the rhythm section is more pronounced, but it will fail for other genres such as rock or folk music.
To overcome this problem a second algorithm is proposed that computes the energies for sub-bands of the spectrum instead of the whole spectrum. For this the time-domain signal has to be transformed to the frequency domain, which is usually accomplished by a Fourier Transform. You basically repeat the process of the first algorithm for each sub-band separately. This way you can differentiate (more or less) between different octaves and thus instruments. Well, only to some extent. The article proposes to split the sub-bands up linearly, which should alert any music aficionado immediately. Given a sample rate of 44100 Hz, your FT will be able to give you the magnitudes of the frequencies from 0 to 22050 Hz (Nyquist-Shannon theorem: you only get up to half the sampling rate). Now, most of the interesting stuff will happen between 0 and 6000 Hz. It thus makes more sense to choose your sub-bands in a non-linear fashion.
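To make the linear split concrete, here is a rough sketch of how the per-band energies of a single frame could be computed. It is not Patin’s code: the names are hypothetical, and a naive O(n²) DFT stands in for a real FFT just to keep the example self-contained.

```java
class SubBandEnergies {
    /** Magnitudes of bins 0..n/2 via a naive DFT (use an FFT for real work). */
    static double[] magnitudes(float[] frame) {
        int n = frame.length;
        double[] mags = new double[n / 2 + 1];
        for (int k = 0; k <= n / 2; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im -= frame[t] * Math.sin(angle);
            }
            mags[k] = Math.sqrt(re * re + im * im);
        }
        return mags;
    }

    /** Center frequency in Hz of a bin; bin n/2 is the Nyquist frequency. */
    static double binFrequency(int bin, int frameSize, double sampleRate) {
        return bin * sampleRate / frameSize;
    }

    /** Sums squared magnitudes into `bands` sub-bands of (nearly) equal width. */
    static double[] linearBandEnergies(double[] mags, int bands) {
        double[] energies = new double[bands];
        for (int k = 0; k < mags.length; k++) {
            int band = Math.min(bands - 1, k * bands / mags.length);
            energies[band] += mags[k] * mags[k];
        }
        return energies;
    }
}
```

Each band’s energy would then be compared against that band’s own recent average, exactly as in the whole-spectrum version. With a 1024-sample frame at 44100 Hz, `binFrequency(512, 1024, 44100)` gives 22050 Hz, the upper limit mentioned above.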
One way to do this is to use octaves, as explained in this fine blog entry. This sub-band layout makes a lot more sense from a musical point of view. The article comes with a nice Processing applet that shows the effects of linear sub-banding versus log sub-banding. The beat is much more easily detected in the latter.
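One possible way to lay out octave-wide bands is to start at the Nyquist frequency and keep halving. The sketch below is an assumption about how such a layout could be computed, not the linked article’s implementation; the names and the choice of a lowest edge are hypothetical.

```java
class OctaveBands {
    /**
     * Returns ascending band edges in Hz, from 0 up to the Nyquist frequency.
     * Every band except the lowest spans one octave (upper edge = 2 x lower
     * edge); the lowest band collects everything from 0 Hz up to the first
     * edge, which is the smallest halving of Nyquist still >= lowestEdgeHz.
     */
    static double[] octaveEdges(double sampleRate, double lowestEdgeHz) {
        double edge = sampleRate / 2;
        int bands = 1;
        while (edge / 2 >= lowestEdgeHz) {
            edge /= 2;
            bands++;
        }
        double[] edges = new double[bands + 1];
        edges[0] = 0;
        for (int i = 1; i <= bands; i++) {
            edges[i] = edge;
            edge *= 2;
        }
        return edges;
    }

    /** Index of the band a frequency falls into, clamped to the top band. */
    static int bandOf(double frequency, double[] edges) {
        for (int b = 0; b < edges.length - 2; b++) {
            if (frequency < edges[b + 1]) {
                return b;
            }
        }
        return edges.length - 2;
    }
}
```

For 44100 Hz and a lowest edge of 40 Hz this yields ten bands, eight of which lie entirely below 5512.5 Hz. Ten equally wide bands would each span 2205 Hz, so most of the interesting range would be squeezed into the first three.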
An implementation of Patin’s two algorithms can be found here. It’s again done with Processing, but given its Java nature it shouldn’t be hard to follow.
The final piece of the beat detection puzzle is the Fourier Transform itself. There’s tons of information on that topic out there and my math skills are not good enough for a full explanation of the gory details. I therefore give you these two nice links to the FT and FFT chapters of the awesome book “The Scientist and Engineer’s Guide to Digital Signal Processing”. I read those a few months ago and used them today to brush up on the knowledge I acquired back then. I suggest reading all the chapters if you really want to get an insight into digital signal processing; it’s an awesome and actually easy read (if you are not afraid of a little math).
Finally, a nice little Java class for the FFT can be found here. It has no external dependencies and is GPLv2 licensed.
So what’s the attack plan? Apart from the work on Newton levels (which is tedious, I suck at creating levels…), the prototyping of the pony/bunny game mentioned a couple of posts earlier, a potential book deal (yay! more work) and my day job, I might find some free minutes to actually try to get a sub-band based beat detector going in plain Java, using the Java Sound API, which I’m pretty familiar with, to get my precious PCM samples. If that works out well I will consider putting the bunny game on the back burner and focusing on a rhythm racing game (no stupid block stacking…). This would mean that I’d have to port at least libmad and the Vorbis lib to Android as well as get the FFT and beat detection code ported to C (not a big deal, the code is rather small). Generating tracks from the beat detection results as well as some other analysis shouldn’t actually be a big challenge. Imagine a rhythm racing game on your Android where you can use your personal music library.
I should really learn to focus. Really, really.
My Google Analytics tells me that there’s a constant stream of 10 visitors per day reading this blog (given its age that’s pretty cool, I think). I urge you to take part in this site by leaving comments/suggestions/hate messages. Tell us your thoughts and ideas, we love interacting with you!
Love the idea of a mobile Audiosurf style game. Love the blog too. Keep up the good work!
Thanks for your kind comment. We’ll do our best to make a mobile Audiosurf style game possible on Android (lots and lots of obstacles there).