Light Racer 2.0 - Days 32-33 - Getting Great Game Performance

Android

Choppiness in game play can be an absolute make or break deal for the success of any game. I never thought that the first Light Racer would get downloaded and played as many times as it has. If I had realized that then, I probably would have invested a few more days smoothing it out. All of that is water under the bridge now though because the last 2 days of work have brought Light Racer 2 up between 45 and 60 frames per second on the G1 and no stutter in game from garbage collection or resource management. I covered all of the issues I faced and provided some solutions. This will probably apply to many other games as well.

Day 32 - What was done:

Added tilt sensing. I'm not happy with how it's working right now. I have the basic trig working for sensing the tilt of the two axis I'd like to use and I'm keeping a running average of how the user is holding the device, but it's still not working as well as I thought it would. This game isn't really well suited to tilt-control but since so many people think it's cool, I'm going to try to get it working well enough.

I did more performance optimization and the game is now running at nearly a full 60FPS on my G1. I'm having a few issues with garbage collection causing 240ms lags and I have an idea of what's causing it. I did as much as possible to ensure that no new memory was allocated after game initialization. Unfortunately there are a few arrays and a lot of Coordinate objects that are still being created in the loop. Once I find a way to pool those resources, the game should be free of GC jitters.

So how did I get from 20FPS to 60FPS in a few days time without cutting any features?

Changing the background bitmap to RGB_565 helped tremendously. That alone got me to 30-35fps. I was still drawing a full surface layer in ARGB_8888 that the trails were drawn on to the main canvas. I realized that if I could somehow get rid of that really expensive draw, I'd probably gain another 20fps. I thought about it and realized that if I keep a clean background bitmap at RGB_565 and a dynamic one at RGB_565 which has the same bg and also gets trails on it, which is "wiped" by copying the clean one to it, that it would be far more CPU efficient. I switched out the code and I was right. Now I'm drawing a dynamic background image at RGB_565 and then the individual sprites.

I optimized all of my floating point math to use FloatMath and then a few methods like my fast float atan2, toDegrees and toRadians. I now use no doubles anywhere in the code and while my arc tangents aren't totally accurate, you would never know that when playing the game. Accuracy isn't as important as the speed.

I also went through all of the code that runs in the update and draw loop and found ways to reuse objects instead of allocating new ones. For example, I was creating new Matrixes for each player for every draw call. Now each player has a member Matrix and each time it is simply reset().

The big changes probably doubled my framerate but all of those little changes have also helped to reduce CPU usage and keep the game super smooth. Besides the delay when GC is running, which I will solve, the game is very smooth now - much better than Light Racer 1. My FPS counter shows around 60FPS for most of the game play time, even with 2 AI players running path finders on a map with many animated obstacles. That's the level of playability I've been working towards with all of these days of optimizations. I want to make sure that when people get the game, it's so smooth that it seems like the phone was designed specifically to make it play that way.

Day 33 - What was done:

Started using DDMS to track allocations and figure out why I'm seeing so much garbage collection. http://android-developers.blogspot.com/2009/02/track-memory-allocations....

Found that canvas.getMatrix() allocates a new Matrix if there isn't one. This wasn't a big deal but I think it's strange that Canvas makes the assumption that a null matrix and a new matrix are the same thing. For performance sake, I'd prefer if it returned null if no matrix is set. Every class I have that uses a matrix to draw has just a single field called reusableMatrix. I used the same thing here to fix this.

Also found that the text I'm drawing on the info header is causing tons of new char[] and Strings. I put in some serious optimizations to stop that. The bigger problem is that doing something simple like Canvas.drawText(score + scoreLabel, x, y, paint) is crazy expensive in terms of allocations. Here's how that code actually executes:

1) score needs to be a String so Integer.toString(score) is implicitly called.
2) Integer.toString(int) creates a new char[] and a new String and returns it.
3) A new StringBuffer() is created to concatenate the score string and scoreLabel string.
4) StringBuffer.toString() is then called which creates another new String()

Let's count this up. For EACH call (every frame, remember) to drawText(int + string, ...) a new char array, new StringBuffer (with its char array) and 2 new Strings, each with their own char arrays are created. All in all, the garbage collector has to collect 7 short-term objects for this one draw. That may not seem like a lot, but imagine you are drawing 2 of these labels like I was 60 times per second. That's 60 * 14 = 840 objects created per second or 50,400 objects per minute, which will max out Android garbage pile at around 1.7MB, causing a 100ms GC to run and interrupt your game.

The fix isn't _that_ hard but it is a serious optimization which means serious unreadability of the code if you don't understand it. What you need to do to stop this is to create a char[] to hold your stringed integer in. Then create a StringBuffer that you will reuse to do your concatenations. There are other ways of solving this problem, like caching the string if the actual text does not change often, but my scores change like the national debt rises so I have to do it this way.

Ready?

Here's the discussion thread where I worked out what to do with a bunch of other android developers. These guys are very helpful.

This is about as efficient as I will ever care to make such a thing.
If you want to _never_ allocate, you could make a char[][] where the
first dimension is the length of the second dimension arrays. That
would be char[1], char[2], char[3], char[4] and so on. That would
make it so that if your string size were 5 chars, you would say char[]
correctArray = myArrays[5]. I didn't both with that because a few
allocations are ok, just not one every tick of the loop.

This is ugly but if you need something like this, it does work:

I have a class called Util and I put this in it:

/* Most of this code is copied from the Integer class in Java 6 SDK. It's slightly modified here but the original copyrights should apply. */
private final static int[] intSizeTable = { 9, 99, 999, 9999, 99999,
999999, 9999999, 99999999, 999999999,
Integer.MAX_VALUE };

private final static char[] DigitTens = { '0', '0', '0', '0', '0',
'0', '0', '0', '0', '0', '1', '1', '1', '1',
'1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '2', '2', '2',
'2', '2', '2', '3', '3', '3', '3', '3',
'3', '3', '3', '3', '3', '4', '4', '4', '4', '4', '4', '4', '4',
'4', '4', '5', '5', '5', '5', '5', '5',
'5', '5', '5', '5', '6', '6', '6', '6', '6', '6', '6', '6', '6',
'6', '7', '7', '7', '7', '7', '7', '7',
'7', '7', '7', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8',
'9', '9', '9', '9', '9', '9', '9', '9',
'9', '9', };

private final static char[] DigitOnes = { '0', '1', '2', '3', '4',
'5', '6', '7', '8', '9', '0', '1', '2', '3',
'4', '5', '6', '7', '8', '9', '0', '1', '2', '3', '4', '5', '6',
'7', '8', '9', '0', '1', '2', '3', '4',
'5', '6', '7', '8', '9', '0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', '0', '1', '2', '3', '4', '5',
'6', '7', '8', '9', '0', '1', '2', '3', '4', '5', '6', '7', '8',
'9', '0', '1', '2', '3', '4', '5', '6',
'7', '8', '9', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
'0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', };

private final static char[] digits = { '0', '1', '2', '3', '4', '5',
'6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e',
'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r',
's', 't', 'u', 'v', 'w', 'x', 'y', 'z' };

// Requires positive x
private static int stringSize(int x) {
for (int i = 0;; i++) {
if (x <= intSizeTable[i]) {
return i + 1;
}
}
}

public static void getChars(int i, int index, char[] buf) {
if (i == Integer.MIN_VALUE) {
System.arraycopy("-2147483648".toCharArray(), 0, buf, 0,
buf.length);
}
int q, r;
int charPos = index;
char sign = 0;

if (i < 0) {
sign = '-';
i = -i;
}

// Generate two digits per iteration
while (i >= 65536) {
q = i / 100;
// really: r = i - (q * 100);
r = i - ((q << 6) + (q << 5) + (q << 2));
i = q;
buf[--charPos] = DigitOnes[r];
buf[--charPos] = DigitTens[r];
}

// Fall thru to fast mode for smaller numbers
// assert(i <= 65536, i);
for (;;) {
q = (i * 52429) >>> (16 + 3);
r = i - ((q << 3) + (q << 1)); // r = i-(q*10) ...
buf[--charPos] = digits[r];
i = q;
if (i == 0)
break;
}
if (sign != 0) {
buf[--charPos] = sign;
}
}

Then in my game code it looks like this (for my FPS counter):

private char[] fpsChars = new char[2];
private StringBuffer fpsText = new StringBuffer(7);

private void drawFPS(Canvas canvas) {
int fps = this.fps;
int fpsStringLength = Util.getSize(fps);
if (fpsChars.length != fpsStringLength) {
// re-allocate
fpsChars = new char[fpsStringLength];
}
char[] fpsChars = this.fpsChars;
//copy the chars into the array
Util.getChars(fps, fpsStringLength, fpsChars);
StringBuffer fpsText = this.fpsText;
fpsText.delete(0, fpsText.length());
fpsText.append(fpsChars).append(FPS_TEXT);
canvas.drawText(fpsText, 0, fpsText.length(), worldWidth - 60,
worldHeight + INFO_HEADER_HEIGHT - 20,
gameResources.fpsPaint);
}

That is SO MUCH more code than I ever wanted there but it's
ridiculously more efficient than it was before so I'm going to call it
good and move on.

So, with all of this said and done, I now have a game that runs at a great framerate (45-60fps on the G1) and never garbage collects while playing. I'd say that's about all I could ask for. It's finally time to quit messing around with performance and get some new screens done!

Update 8/28/2009 - Timothy F has sent me an easier way to do a counter update with no allocations:

static final char c[] = new c[100];
static final StringBuilder sb = new StringBuilder(100);
private void drawFPS(Canvas canvas) {
sb.setLength(0);
sb.append(fps);
sb.append(FPS_TEXT);
sb.getChars(0, sb.length(), c, 0);
canvas.drawText(c, 0, sb.length(), worldWidth - 60, worldHeight +
INFO_HEADER_HEIGHT - 20, gameResources.fpsPaint);
}