Light Racer 2.0 - Days 32-33 - Getting Great Game Performance

Choppiness in game play can be an absolute make or break deal for the success of any game. I never thought that the first Light Racer would get downloaded and played as many times as it has. If I had realized that then, I probably would have invested a few more days smoothing it out. All of that is water under the bridge now though, because the last 2 days of work have brought Light Racer 2 up to between 45 and 60 frames per second on the G1, with no in-game stutter from garbage collection or resource management. This post covers the issues I faced and the solutions I found, which will probably apply to many other games as well.

Day 32 - What was done:

Added tilt sensing. I'm not happy with how it's working right now. I have the basic trig working for sensing the tilt of the two axes I'd like to use and I'm keeping a running average of how the user is holding the device, but it's still not working as well as I thought it would. This game isn't really well suited to tilt control, but since so many people think it's cool, I'm going to try to get it working well enough.
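As a rough illustration of the tilt idea described above, here is a minimal sketch in plain Java. It assumes the gravity components along two device axes arrive as floats (on Android they would come from an accelerometer listener, which is left out here). The class name TiltTracker, the window size and the method names are all hypothetical and not taken from the game, and it uses Math.atan2 for clarity rather than a fast approximation.

```java
/**
 * Hypothetical helper: converts two gravity components into a tilt angle and
 * reports it relative to a running average of how the device is being held.
 * Angle wraparound near +/-180 degrees is ignored to keep the sketch short.
 */
final class TiltTracker {
    private final float[] window; // most recent raw angles, in degrees
    private int count;            // number of valid samples in the window
    private int next;             // slot that the next sample overwrites
    private float sum;            // sum of the valid samples

    TiltTracker(int windowSize) {
        window = new float[windowSize];
    }

    /** Raw tilt angle in degrees from two gravity components. */
    static float rawAngle(float axisA, float axisB) {
        return (float) Math.toDegrees(Math.atan2(axisA, axisB));
    }

    /** Feeds one sample and returns the raw angle minus the running average. */
    float update(float axisA, float axisB) {
        float angle = rawAngle(axisA, axisB);
        if (count == window.length) {
            sum -= window[next]; // window is full: drop the oldest sample
        } else {
            count++;
        }
        window[next] = angle;
        sum += angle;
        next = (next + 1) % window.length;
        return angle - sum / count;
    }
}
```

The value returned by update() is how far the device is tilted away from its recent resting position, which is the quantity a steering control would react to.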

I did more performance optimization and the game is now running at nearly a full 60FPS on my G1. I'm having a few issues with garbage collection causing 240ms lags and I have an idea of what's causing it. I did as much as possible to ensure that no new memory was allocated after game initialization. Unfortunately there are a few arrays and a lot of Coordinate objects that are still being created in the loop. Once I find a way to pool those resources, the game should be free of GC jitters.
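One common way to pool small objects like these is a fixed-capacity free list that is filled once at startup. The sketch below is an assumption about how such a pool could look, not the game's code: Coordinate here is a hypothetical two-float stand-in, and CoordinatePool, obtain and release are made-up names.

```java
/** Hypothetical stand-in for the game's Coordinate objects. */
final class Coordinate {
    float x, y;

    Coordinate set(float x, float y) {
        this.x = x;
        this.y = y;
        return this;
    }
}

/** Fixed-capacity pool: every Coordinate is allocated once, up front. */
final class CoordinatePool {
    private final Coordinate[] free;
    private int freeCount;

    CoordinatePool(int capacity) {
        free = new Coordinate[capacity];
        for (int i = 0; i < capacity; i++) {
            free[i] = new Coordinate();
        }
        freeCount = capacity;
    }

    /** Hands out a pooled instance; allocates only if the pool is exhausted. */
    Coordinate obtain(float x, float y) {
        Coordinate c = (freeCount > 0) ? free[--freeCount] : new Coordinate();
        return c.set(x, y);
    }

    /** Returns an instance for reuse; instances beyond capacity are dropped. */
    void release(Coordinate c) {
        if (freeCount < free.length) {
            free[freeCount++] = c;
        }
    }

    int available() {
        return freeCount;
    }
}
```

The caller has to release each Coordinate when it is done with it and must not keep using it afterwards, since the same instance will be handed out again.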

So how did I get from 20FPS to 60FPS in a few days' time without cutting any features?

Changing the background bitmap to RGB_565 helped tremendously. That alone got me to 30-35fps. I was still drawing a full-screen ARGB_8888 layer, the one the trails were drawn on, onto the main canvas every frame. I realized that if I could somehow get rid of that really expensive draw, I'd probably gain another 20fps. The idea I came up with was to keep a clean background bitmap at RGB_565, plus a dynamic one at RGB_565 that has the same background and also gets the trails drawn on it. The dynamic one is "wiped" by copying the clean one over it. I figured that would be far more CPU efficient, so I switched out the code, and I was right. Now I'm drawing a dynamic background image at RGB_565 and then the individual sprites.
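To make the clean/dynamic pair concrete, here is a small sketch that uses raw 16-bit pixel arrays in place of real bitmaps so that it runs anywhere. On Android the two buffers would be Bitmap objects created with Bitmap.Config.RGB_565 and drawn through a Canvas. TrailLayer and its methods are hypothetical names for illustration only.

```java
/**
 * Hypothetical illustration of the clean/dynamic background pair, using
 * short[] pixel buffers (16 bits per pixel, like RGB_565) instead of bitmaps.
 */
final class TrailLayer {
    private final short[] clean;   // background only, never modified after setup
    private final short[] dynamic; // background plus trails: the one full-screen draw
    private final int width;

    TrailLayer(short[] backgroundPixels, int width) {
        this.clean = backgroundPixels.clone();
        this.dynamic = backgroundPixels.clone();
        this.width = width;
    }

    /** Draws only the new bit of trail; older trail pixels are already there. */
    void plotTrail(int x, int y, short color) {
        dynamic[y * width + x] = color;
    }

    /** "Wipes" every trail by copying the clean background over the dynamic one. */
    void wipe() {
        System.arraycopy(clean, 0, dynamic, 0, clean.length);
    }

    short pixel(int x, int y) {
        return dynamic[y * width + x];
    }
}
```

Because the trails accumulate in the dynamic buffer, each frame only pays for the small change to the trail plus one opaque full-screen draw, instead of an opaque background followed by a full-screen layer with alpha.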

I optimized all of my floating point math to use FloatMath and then a few methods like my fast float atan2, toDegrees and toRadians. I now use no doubles anywhere in the code and while my arc tangents aren't totally accurate, you would never know that when playing the game. Accuracy isn't as important as the speed.
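The post does not show the fast atan2 it mentions, so the following is only a sketch of one widely used style of float-only approximation (a short odd polynomial on the range 0 to 1, then folded into the right quadrant). FastMath and its method names are hypothetical, and the coefficients are a commonly circulated set rather than anything from the game.

```java
/** Hypothetical float-only math helpers; the results are approximations. */
final class FastMath {
    private static final float PI = 3.14159274f;
    private static final float HALF_PI = 1.57079637f;
    private static final float RAD_TO_DEG = 180f / PI;
    private static final float DEG_TO_RAD = PI / 180f;

    /** Approximate atan2 in radians, using only float arithmetic. */
    static float atan2(float y, float x) {
        if (x == 0f && y == 0f) {
            return 0f;
        }
        float ax = Math.abs(x);
        float ay = Math.abs(y);
        // a is in [0, 1], where a short polynomial tracks atan closely.
        float a = Math.min(ax, ay) / Math.max(ax, ay);
        float s = a * a;
        float r = ((-0.0464964749f * s + 0.15931422f) * s - 0.327622764f) * s * a + a;
        if (ay > ax) r = HALF_PI - r; // fold from the first octant to the first quadrant
        if (x < 0f) r = PI - r;       // mirror into the left half-plane
        if (y < 0f) r = -r;           // mirror into the lower half-plane
        return r;
    }

    static float toDegrees(float radians) {
        return radians * RAD_TO_DEG;
    }

    static float toRadians(float degrees) {
        return degrees * DEG_TO_RAD;
    }
}
```

The trade is the one described above: a little accuracy is given up in exchange for staying entirely in float math.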

I also went through all of the code that runs in the update and draw loop and found ways to reuse objects instead of allocating new ones. For example, I was creating a new Matrix for each player on every draw call. Now each player has a member Matrix, and each time it is simply reset().
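The member-object pattern can be sketched without Android as follows. Transform is a hypothetical stand-in for android.graphics.Matrix, PlayerSprite is a made-up class, and the static counter exists only to show that the constructor runs once no matter how many frames are drawn.

```java
/** Hypothetical stand-in for android.graphics.Matrix. */
final class Transform {
    static int created; // counts constructor calls, for demonstration only
    float tx, ty, degrees;

    Transform() {
        created++;
    }

    void reset() {
        tx = 0f;
        ty = 0f;
        degrees = 0f;
    }
}

/** Hypothetical player object that owns one reusable Transform. */
final class PlayerSprite {
    private final Transform reusable = new Transform(); // allocated once per player
    float x, y, heading;

    /** Prepares this frame's transform without allocating anything. */
    Transform prepareDraw() {
        Transform t = reusable;
        t.reset();
        t.tx = x;
        t.ty = y;
        t.degrees = heading;
        return t;
    }
}
```

The returned object is only valid until the next call to prepareDraw(), which is fine for a draw loop that uses it immediately.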

The big changes probably doubled my framerate but all of those little changes have also helped to reduce CPU usage and keep the game super smooth. Besides the delay when GC is running, which I will solve, the game is very smooth now - much better than Light Racer 1. My FPS counter shows around 60FPS for most of the game play time, even with 2 AI players running path finders on a map with many animated obstacles. That's the level of playability I've been working towards with all of these days of optimizations. I want to make sure that when people get the game, it's so smooth that it seems like the phone was designed specifically to make it play that way.

Day 33 - What was done:

Started using DDMS to track allocations and figure out why I'm seeing so much garbage collection. http://android-developers.blogspot.com/2009/02/track-memory-allocations....

Found that canvas.getMatrix() allocates a new Matrix if there isn't one. This wasn't a big deal, but I think it's strange that Canvas makes the assumption that a null matrix and a new matrix are the same thing. For performance's sake, I'd prefer if it returned null when no matrix is set. Every class I have that uses a matrix to draw has just a single field called reusableMatrix. I used the same thing here to fix this.

Also found that the text I'm drawing on the info header is causing tons of new char[] and Strings. I put in some serious optimizations to stop that. The bigger problem is that doing something simple like Canvas.drawText(score + scoreLabel, x, y, paint) is crazy expensive in terms of allocations. Here's how that code actually executes:

1) score needs to be a String so Integer.toString(score) is implicitly called.
2) Integer.toString(int) creates a new char[] and a new String and returns it.
3) A new StringBuffer() is created to concatenate the score string and scoreLabel string.
4) StringBuffer.toString() is then called which creates another new String()

Let's count this up. For EACH call (every frame, remember) to drawText(int + string, ...), a new char array, a new StringBuffer (with its char array) and 2 new Strings, each with their own char arrays, are created. All in all, the garbage collector has to collect 7 short-term objects for this one draw. That may not seem like a lot, but imagine you are drawing 2 of these labels 60 times per second, like I was. That's 60 * 14 = 840 objects created per second or 50,400 objects per minute, which will max out Android's garbage pile at around 1.7MB, causing a 100ms GC to run and interrupt your game.
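The steps and the arithmetic above can be written out explicitly. This sketch follows the post's description of the expansion step by step; what a given compiler and runtime actually generate may differ (for example, a StringBuilder in place of the StringBuffer). ConcatCost and its method names are hypothetical.

```java
/** Hypothetical illustration of the hidden work behind "score + scoreLabel". */
final class ConcatCost {
    /** The concatenation written out by hand, following the steps in the post. */
    static String expanded(int score, String scoreLabel) {
        String scoreText = Integer.toString(score); // new char[] and new String
        StringBuffer sb = new StringBuffer();       // new StringBuffer and its char[]
        sb.append(scoreText).append(scoreLabel);
        return sb.toString();                       // another new String
    }

    /** Short-lived objects created per second for the given draw pattern. */
    static int objectsPerSecond(int objectsPerDraw, int labelsPerFrame, int framesPerSecond) {
        return objectsPerDraw * labelsPerFrame * framesPerSecond;
    }
}
```

With 7 objects per draw, 2 labels and 60 frames per second, objectsPerSecond(7, 2, 60) gives 840, and 840 * 60 gives 50,400 per minute, which matches the count above.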

The fix isn't _that_ hard but it is a serious optimization which means serious unreadability of the code if you don't understand it. What you need to do to stop this is to create a char[] to hold your stringed integer in. Then create a StringBuffer that you will reuse to do your concatenations. There are other ways of solving this problem, like caching the string if the actual text does not change often, but my scores change like the national debt rises so I have to do it this way.
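For the caching alternative mentioned above (which is not the approach used in this game), a minimal sketch could look like this. It only helps when the value changes rarely, because every change still builds a new String. CachedLabel and its members are hypothetical names.

```java
/** Hypothetical label that rebuilds its text only when the value changes. */
final class CachedLabel {
    private final String suffix;
    private int lastValue;
    private String text;
    int rebuilds; // counts how often a new String was built, for demonstration

    CachedLabel(int initialValue, String suffix) {
        this.suffix = suffix;
        rebuild(initialValue);
    }

    /** Returns the cached text, rebuilding it only if the value changed. */
    String textFor(int value) {
        if (value != lastValue) {
            rebuild(value);
        }
        return text;
    }

    private void rebuild(int value) {
        lastValue = value;
        text = value + suffix; // allocates, but only on an actual change
        rebuilds++;
    }
}
```

A label drawn 60 times per second whose value changes once per second would allocate once per second instead of 60 times.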

Ready?

Here's the discussion thread where I worked out what to do with a bunch of other android developers. These guys are very helpful.

This is about as efficient as I will ever care to make such a thing. If you want to _never_ allocate, you could make a char[][] where the first dimension is the length of the second dimension arrays. That would be char[1], char[2], char[3], char[4] and so on. That would make it so that if your string size were 5 chars, you would say char[] correctArray = myArrays[5]. I didn't bother with that because a few allocations are ok, just not one every tick of the loop.
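The char[][] idea from the paragraph above could be sketched as follows. The upper bound of 11 is an assumption chosen because "-2147483648" is the longest text an int can produce. DigitBuffers and forLength are hypothetical names.

```java
/** Hypothetical table of pre-allocated char arrays, indexed by their length. */
final class DigitBuffers {
    // BUFFERS[n] has length n. Index 0 holds an unused empty array so that
    // the index and the array length always match.
    private static final char[][] BUFFERS = new char[12][];

    static {
        for (int n = 0; n < BUFFERS.length; n++) {
            BUFFERS[n] = new char[n];
        }
    }

    /** Returns the shared array of exactly the requested length (0 to 11). */
    static char[] forLength(int length) {
        return BUFFERS[length];
    }
}
```

The arrays are shared, so a caller has to finish with one before anything else asks for the same length, and the table is not safe to use from several threads at once.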

This is ugly but if you need something like this, it does work:

I have a class called Util and I put this in it:

/* Most of this code is copied from the Integer class in Java 6 SDK. It's slightly modified here but the original copyrights should apply. */
private final static int[] intSizeTable = {
    9, 99, 999, 9999, 99999, 999999, 9999999, 99999999, 999999999,
    Integer.MAX_VALUE };

private final static char[] DigitTens = {
    '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
    '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
    '2', '2', '2', '2', '2', '2', '2', '2', '2', '2',
    '3', '3', '3', '3', '3', '3', '3', '3', '3', '3',
    '4', '4', '4', '4', '4', '4', '4', '4', '4', '4',
    '5', '5', '5', '5', '5', '5', '5', '5', '5', '5',
    '6', '6', '6', '6', '6', '6', '6', '6', '6', '6',
    '7', '7', '7', '7', '7', '7', '7', '7', '7', '7',
    '8', '8', '8', '8', '8', '8', '8', '8', '8', '8',
    '9', '9', '9', '9', '9', '9', '9', '9', '9', '9', };

private final static char[] DigitOnes = {
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', };

private final static char[] digits = {
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
    'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
    'u', 'v', 'w', 'x', 'y', 'z' };

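// Returns the number of characters needed to print x. Requires positive x.
// Public and named getSize so that it matches the Util.getSize(...) call
// in the game code below.
public static int getSize(int x) {
    for (int i = 0;; i++) {
        if (x <= intSizeTable[i]) {
            return i + 1;
        }
    }
}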

// Writes the decimal text of i into buf so that it ends just before index.
public static void getChars(int i, int index, char[] buf) {
    if (i == Integer.MIN_VALUE) {
        // -i overflows for MIN_VALUE, so copy its 11 characters directly,
        // ending at index like every other value, and stop here.
        "-2147483648".getChars(0, 11, buf, index - 11);
        return;
    }
    int q, r;
    int charPos = index;
    char sign = 0;

    if (i < 0) {
        sign = '-';
        i = -i;
    }

    // Generate two digits per iteration
    while (i >= 65536) {
        q = i / 100;
        // really: r = i - (q * 100);
        r = i - ((q << 6) + (q << 5) + (q << 2));
        i = q;
        buf[--charPos] = DigitOnes[r];
        buf[--charPos] = DigitTens[r];
    }

    // Fall thru to fast mode for smaller numbers
    // assert(i <= 65536, i);
    for (;;) {
        q = (i * 52429) >>> (16 + 3);
        r = i - ((q << 3) + (q << 1)); // r = i-(q*10) ...
        buf[--charPos] = digits[r];
        i = q;
        if (i == 0)
            break;
    }
    if (sign != 0) {
        buf[--charPos] = sign;
    }
}

Then in my game code it looks like this (for my FPS counter):

private char[] fpsChars = new char[2];
private StringBuffer fpsText = new StringBuffer(7);

private void drawFPS(Canvas canvas) {
    int fps = this.fps;
    int fpsStringLength = Util.getSize(fps);
    if (fpsChars.length != fpsStringLength) {
        // re-allocate
        fpsChars = new char[fpsStringLength];
    }
    char[] fpsChars = this.fpsChars;
    // copy the chars into the array
    Util.getChars(fps, fpsStringLength, fpsChars);
    StringBuffer fpsText = this.fpsText;
    fpsText.delete(0, fpsText.length());
    fpsText.append(fpsChars).append(FPS_TEXT);
    canvas.drawText(fpsText, 0, fpsText.length(), worldWidth - 60,
            worldHeight + INFO_HEADER_HEIGHT - 20,
            gameResources.fpsPaint);
}

That is SO MUCH more code than I ever wanted there but it's
ridiculously more efficient than it was before so I'm going to call it
good and move on.

So, with all of this said and done, I now have a game that runs at a great framerate (45-60fps on the G1) and never garbage collects while playing. I'd say that's about all I could ask for. It's finally time to quit messing around with performance and get some new screens done!

Update 8/28/2009 - Timothy F has sent me an easier way to do a counter update with no allocations:

static final char c[] = new char[100];
static final StringBuilder sb = new StringBuilder(100);

private void drawFPS(Canvas canvas) {
    sb.setLength(0);
    sb.append(fps);
    sb.append(FPS_TEXT);
    sb.getChars(0, sb.length(), c, 0);
    canvas.drawText(c, 0, sb.length(), worldWidth - 60,
            worldHeight + INFO_HEADER_HEIGHT - 20, gameResources.fpsPaint);
}

14 Comments

Could you explain in more detail what you mean by a dynamic background with two RGB_565 bitmaps?

"I thought about it and realized that if I keep a clean background bitmap at RGB_565 and a dynamic one at RGB_565 which has the same bg and also gets trails on it, which is "wiped" by copying the clean one to it, that it would be far more CPU efficient. I switched out the code and I was right. Now I'm drawing a dynamic background image at RGB_565 and then the individual sprites."

I am trying to understand what your background consists of so that you could replace 1 background with alpha with two that do not have alpha. I have a problem where my background requires alpha because it needs transparency. I have basically a hole that a ball needs to go into and so that hole is a transparent part of the background so the ball can disappear into it.

That optimization was only for lightracer and really is only effective for games that are constantly modifying the background. The true background does not include the light trails but it may as well, which is what I ended up doing.

I would have to see more of what you're trying to do to figure out how to optimize. Often times you can look at the problem from a totally different angle to figure out a solution, but a screenshot or demo of your game would be helpful here.

I don't understand this either.

Hey Robert,

Thanks for your helpful articles. I have the same question as the poster above though. I don't understand what you mean. I assume you have:

1) The black and green grid-like background
2) The motorcycle sprite
3) The light trail (??? not sure if this is a bitmap or something else.)

I'm basing this on Light Racer, because I can't find Light Racer 2.0 anywhere.

How does drawing the light trails on one bitmap then copying to another speed anything up?

Thanks!

The current Light Racer on the market is the 2.0 version, it's just not called LR2 or anything :)

So actually there is a full screen bitmap for the grid then another full screen bitmap for the light trails then the game object and racer sprites. Originally I tried drawing new light trails every frame but that was far too expensive using Canvas's 2D drawing routines. I then drew them to their own bitmap, only drawing the little bit that changes every frame instead of a full trail redraw. That resulted in better performance but now I'm drawing two full screens... So the final solution was to draw a clear grid to a scratch bitmap, then update that bitmap with the light trails that change each frame. That scratch bitmap is the only full screen draw every frame, resulting in the best performance.

about GC

Sir, Thanks for your blog. It's great!

Light Racer has a touch mode for playing.

But onTouchEvent(MotionEvent event) allocates a lot of new MotionEvent objects, so the system GC will still block the game (for a very short time).

Do you have any ideas about this issue? :)

It does not create new MotionEvent objects. It has a pool that it uses and recycles. In an event-based framework like this, you need to assume that that is how it is done and you should not trust each incoming object to be unique and have only a one-time use or you will have issues where the contents are changed on you.

I wrote an article about pipelining input. Check that out if you want to know more about efficient input processing.

it will create float array

Thanks for the reply:)

It does not create new MotionEvent objects, but it does create a new float array. See the following:

656 float[] 3 android.view.MotionEvent
656 float[] 3 android.view.MotionEvent
656 float[] 3 android.view.MotionEvent
656 float[] 3 android.view.MotionEvent
656 float[] 3 android.view.MotionEvent
(there are many more, this is just a snapshot)
It seems that every time a MotionEvent is recorded, a float array is created?

By the way, I have read your article about pipelining input:
http://www.rbgrn.net/content/342-using-input-pipelines-your-android-game
(It is great!)

That's strange... Is that from the allocation tracker?

Yes, it's from the allocation tracker in the DDMS tools. I tested it by touching the screen on my G3 (Hero, ROM: 1.5).

I'll have to do some testing in 2.1 and see if that's fixed in newer versions.

I have been having the same problem with drawText(),

have tried the new shorter code above but still produces allocations at this line:

sb.append(fps);

which is a shame, as this is the only thing letting my game down :(

Use a char array and place individual chars for rapid-changing text. Otherwise, you can cache common strings as char arrays and just place those in.
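A minimal sketch of placing individual chars, assuming a non-negative value and a fixed maximum number of digits, might look like the following. ScoreText and its methods are hypothetical names. On Android the result could be handed to Canvas.drawText(char[] text, int index, int count, float x, float y, Paint paint).

```java
/** Hypothetical fixed-size digit buffer that is filled in place each frame. */
final class ScoreText {
    private final char[] chars;

    ScoreText(int maxDigits) {
        chars = new char[maxDigits];
    }

    /**
     * Writes a non-negative value right-aligned into the buffer and returns
     * the index of its first digit. If the value has more digits than the
     * buffer holds, the leading digits are dropped.
     */
    int write(int value) {
        int pos = chars.length;
        do {
            chars[--pos] = (char) ('0' + value % 10);
            value /= 10;
        } while (value != 0 && pos > 0);
        return pos;
    }

    char[] chars() {
        return chars;
    }
}
```

With int start = text.write(score), the characters to draw are text.chars() from index start, with a count of text.chars().length - start. Nothing is allocated per frame.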

I just want to say thank you to you. Really a great help for me to start the Android developing with.

Awesome Resource

Just wanted to say what a great resource your blog has been since I started developing an Android game two weeks ago. I have been getting all kinds of great information about optimization from here, and this development journal for Light Racer has been really useful to look at. Really thankful you took the time to post it.
