A lot of myths surround the argument about fixed versus floating point arithmetic on Android devices. Many currently available Android phones are missing an FPU, so floating point arithmetic must be emulated in software and is therefore expected to be quite slow compared to fixed point arithmetic, which is based solely on integers.
Another choice one has to make when writing a game for Android is whether to go with Java or C for the performance-intensive parts of the game (or even for the whole game). Legend has it that C will be up to 10 times faster than an equivalent Java implementation.
To finally get some insight into these two topics I wrote a micro benchmark. In my previous post on MD2 model rendering I talked a little about interpolating between two keyframes. Back then I used floats and Java for the interpolation code. For this benchmark I rewrote the interpolation code to use fixed point and also ported the fixed point and floating point code over to C. The Java interpolation routines use float arrays; the C equivalents use direct buffers, since native code can access their memory directly. Here’s the code for the Java methods:
public void interpolate( float[] src, float[] dst, float[] out, float alpha )
{
    for( int i = 0; i < src.length; i++ )
    {
        float s = src[i];
        out[i] = s + (dst[i] - s) * alpha;
    }
}

// all values, including alpha, are 16.16 fixed point
public void interpolateFP( int[] src, int[] dst, int[] out, int alpha )
{
    for( int i = 0; i < src.length; i++ )
    {
        int s = src[i];
        // the shift needs its own parentheses: '+' binds tighter than '>>'
        out[i] = s + (((dst[i] - s) * alpha) >> 16);
    }
}
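The fixed point routine treats its ints as 16.16 values, which is what the shift by 16 implies. Here is a minimal sketch of how such values could be produced and checked. The class and helper names (`FixedPointLerp`, `toFixed`, `toFloat`, `lerp`) are made up for illustration and are not part of the benchmark source. Unlike the benchmark code, this sketch widens the product to `long`, since a 32-bit multiply of two 16.16 values overflows quickly. It also spells out that `+` binds tighter than `>>` in both Java and C, so the shift has to be parenthesized to apply to the product alone.

```java
final class FixedPointLerp {
    // Number of fractional bits; 16 matches the ">> 16" in the interpolation code.
    static final int FRACTION_BITS = 16;
    static final int ONE = 1 << FRACTION_BITS;

    /** Converts a float to 16.16 fixed point, rounded to the nearest step. */
    static int toFixed(float value) {
        return Math.round(value * ONE);
    }

    /** Converts a 16.16 fixed point value back to a float. */
    static float toFloat(int fixed) {
        return fixed / (float) ONE;
    }

    /** Interpolates one component; the product is widened to long to avoid overflow. */
    static int lerp(int src, int dst, int alpha) {
        long product = (long) (dst - src) * alpha;
        return src + (int) (product >> FRACTION_BITS);
    }

    public static void main(String[] args) {
        int a = toFixed(2.0f), b = toFixed(6.0f), alpha = toFixed(0.25f);
        // Computes 2 + (6 - 2) * 0.25 entirely in integer arithmetic.
        System.out.println(toFloat(lerp(a, b, alpha)));
        // Precedence pitfall: the two expressions below are not the same.
        System.out.println((1 + 8 >> 2) + " vs " + (1 + (8 >> 2)));
    }
}
```

With plain `int` arithmetic the example in `main` would already overflow, because the difference of 4.0 times an alpha of 0.25 needs 33 bits before the shift. Whether the extra `long` multiply costs anything noticeable on a given device is something I have not measured.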
The C equivalents look like this:
JNIEXPORT void JNICALL Java_com_badlogic_nativebenchmarks_Native_interpolate(JNIEnv *env, jobject obj, jobject sourceBuffer, jobject destBuffer, jobject targetBuffer, jfloat alpha, jint count )
{
    jfloat* src = (jfloat*)env->GetDirectBufferAddress(sourceBuffer);
    jfloat* dst = (jfloat*)env->GetDirectBufferAddress(destBuffer);
    jfloat* tgt = (jfloat*)env->GetDirectBufferAddress(targetBuffer);

    for( int i = 0; i < count; i++ )
        tgt[i] = src[i] + (dst[i] - src[i]) * alpha;
}

JNIEXPORT void JNICALL Java_com_badlogic_nativebenchmarks_Native_interpolateFixedPoint(JNIEnv *env, jobject obj, jobject sourceBuffer, jobject destBuffer, jobject targetBuffer, jint alpha, jint count)
{
    jint* src = (jint*)env->GetDirectBufferAddress(sourceBuffer);
    jint* dst = (jint*)env->GetDirectBufferAddress(destBuffer);
    jint* tgt = (jint*)env->GetDirectBufferAddress(targetBuffer);

    /* the shift needs its own parentheses: '+' binds tighter than '>>' */
    for( int i = 0; i < count; i++ )
        tgt[i] = src[i] + (((dst[i] - src[i]) * alpha) >> 16);
}
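`GetDirectBufferAddress` only returns a usable pointer for direct buffers, so the Java side has to allocate them accordingly. The following is a rough sketch of how that allocation might look; the class and method names (`DirectBuffers`, `newDirectFloatBuffer`) are placeholders of mine and not taken from the benchmark source.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

final class DirectBuffers {
    /** Allocates a direct, native-order buffer holding a copy of the given floats. */
    static FloatBuffer newDirectFloatBuffer(float[] values) {
        // 4 bytes per float; allocateDirect places the memory outside the Java heap.
        ByteBuffer bytes = ByteBuffer.allocateDirect(values.length * 4);
        // Native byte order, so C code reads the same values Java wrote.
        bytes.order(ByteOrder.nativeOrder());
        FloatBuffer floats = bytes.asFloatBuffer();
        floats.put(values);
        floats.position(0);
        return floats;
    }

    public static void main(String[] args) {
        FloatBuffer src = newDirectFloatBuffer(new float[700]);
        System.out.println(src.isDirect() + " " + src.capacity());
    }
}
```

The same pattern with `asIntBuffer()` would cover the fixed point variant. Buffers created this way can be passed as the `jobject` buffer arguments of the native functions above.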
I benchmarked each of the 4 methods separately, calling it 100,000 times on the same input. Each method call processes an array/buffer of 700 floats/ints. And here are the interesting results:
Java Float: 31.28 seconds
C Float: 8.78 seconds
Java Fixed Point: 18.93 seconds
C Fixed Point: 2.23 seconds
All measurements are averaged over 10 runs.
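For reference, a timing loop along these lines can be sketched as follows. This is an assumption about the general shape of such a harness, not the code from the downloadable source; the names `InterpolationBenchmark`, `timeCalls` and `averageSeconds` are invented here, and only the Java float variant is shown.

```java
import java.util.Arrays;

final class InterpolationBenchmark {
    /** Same float interpolation as in the post. */
    static void interpolate(float[] src, float[] dst, float[] out, float alpha) {
        for (int i = 0; i < src.length; i++) {
            float s = src[i];
            out[i] = s + (dst[i] - s) * alpha;
        }
    }

    /** Returns the wall-clock seconds needed for the given number of calls. */
    static double timeCalls(int calls, int elements) {
        float[] src = new float[elements];
        float[] dst = new float[elements];
        float[] out = new float[elements];
        Arrays.fill(dst, 1.0f);
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            interpolate(src, dst, out, 0.5f);
        }
        return (System.nanoTime() - start) / 1e9;
    }

    /** Averages the timing over several runs. */
    static double averageSeconds(int runs, int calls, int elements) {
        double total = 0;
        for (int r = 0; r < runs; r++) {
            total += timeCalls(calls, elements);
        }
        return total / runs;
    }

    public static void main(String[] args) {
        // 10 runs of 100,000 calls on 700 elements, as described above.
        System.out.println("Java float: " + averageSeconds(10, 100000, 700) + " seconds");
    }
}
```

On a desktop JVM this finishes far faster than the numbers above, so it only illustrates the measurement procedure, not the results.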
Interestingly enough, the C float method is only about 3.5 times faster than the equivalent Java version. The Java fixed point version takes only about 40% less time than the Java float version (roughly 1.65 times faster), indicating that it’s probably not worth using fixed point in a pure Java implementation. The C fixed point method outperforms all other methods by a comfortable margin. It’s about 8.5 times faster than the equivalent Java version and 14 times faster than the Java float version. It’s also about 4 times faster than the C float version, indicating that the software float emulation is quite a bit slower.
The interesting question is why the Java fixed point version is not that much faster than its Java float counterpart. I suspect that the shift is not handled optimally.
To summarize: use C with fixed point arithmetic for maximum performance. If you want to stay with Java, use floats over fixed point for maintainability.
Get the full source here. To recompile the C code you have to add an app to your NDK.
Yo, I’m quite interested in this benchmark. What hardware did you run this on? As far as I can tell, some of the hardware has an FPU and some does not. Furthermore, are there any differences between floating and fixed point arithmetic in OpenGL ES on Android? I found a couple of other things on Google which suggest that GLES is implemented in floating point arithmetic, but nobody has any answers.
I realize it’s been a while since you performed these benchmarks, but it’d be great if you could investigate this a bit further. I’d be willing to help out. I have a Moto.Droid.
Thanks for the information,
-Griff
The benchmark was performed on a Motorola Milestone, which sports an FPU. However, I can’t remember whether I compiled to Thumb (in which case floating point ops get emulated in software) or to ARM. You can download the source for the benchmark from the link given above and play around with it. If you have any new results, it’d be great if you shared them here.
I haven’t found any difference in performance when using fixed point with OpenGL ES on my Hero as well as on my Milestone.
Great benchmark! This will be interesting to run on the JIT compiler with appropriate warm-up as well. Also, I saw that the Java for-loop retrieves the array length on every loop step. In all fairness, you might want to add it to the method parameters instead, like in the native method call.
I did so in preliminary tests and it didn’t seem to have a noticeable impact on performance, at least on my Droid. The matter might be different on the Hero and other first-gen devices. If you happen to benchmark a bit more, please report your results back here!