Carcophony Tech Postmortem
Carcophony was the first Xbox Indie game by GLPeas. It was in play testing for four weeks and passed review on first submission.
We outline here some technical problems and solutions we encountered with the game and the submission/play testing process.
Note: by “play testing” we mean XNA Creators Club play testing. We did a fair amount of game play testing with friends and family during development to refine the design and mechanics of the game, but that’s not what we are focusing on in this article.
We had a few technical goals when we started:
• Simple graphics and focus on game play
• Solid experience - we wanted to run at 60fps but have a decent number of cars on the roads as well as doing other tasks in the background
• Get used to the XNA development framework and process - we had limited experience with C# and XNA as a platform
We wanted to handle a large number of cars on our maps. We realised that every car running complex AI and path-finding algorithms, even on a road network as small as ours, could be a problem. For example, in competitive “versus” mode there could be about 500 cars in play in total across both maps.
So, to avoid complex calculations being done by every car each frame, we pre-calculate a bit of data for our road networks:
• every junction stores information about which road leads to the shortest route to a certain destination; which road is the second best and so on; every time the road network changes, for example a “no entry” sign appears, we recalculate the data
• all road links are splines but they have pre-calculated parameterisation by length stored within them to make car travel calculations faster
Having all this data means we don’t have to compute the shortest route to the car’s exit – it’s already there for us to pick up. We also used a fair number of common-sense tricks, like keeping the cars on each road sorted so we know which one is first, and only processing path-finding information for cars that are about to go through a junction.
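The per-junction lookup described above can be sketched roughly as follows. This is only an illustration, not the Carcophony code: the RoadNetwork class, its method names and the use of a breadth-first search over unweighted links are all assumptions. The sketch also stores only the best road per destination, whereas the game ranks the second-best road and so on.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical road network: junctions are numbered 0..n-1, roads are one-way links.
public class RoadNetwork
{
    private readonly List<int>[] links;  // links[j] = junctions directly reachable from j
    private readonly int[,] nextHop;     // nextHop[from, dest] = neighbour to drive to, -1 if no route

    public RoadNetwork(int junctionCount)
    {
        links = new List<int>[junctionCount];
        for (int i = 0; i < junctionCount; i++) links[i] = new List<int>();
        nextHop = new int[junctionCount, junctionCount];
    }

    public void AddRoad(int from, int to) { links[from].Add(to); }
    public void RemoveRoad(int from, int to) { links[from].Remove(to); }

    // Call once after loading and again whenever the network changes
    // (for example when a "no entry" sign closes a road).
    public void Recalculate()
    {
        int n = links.Length;
        var incoming = new List<int>[n];
        for (int j = 0; j < n; j++) incoming[j] = new List<int>();
        for (int j = 0; j < n; j++)
            foreach (int k in links[j]) incoming[k].Add(j);

        var queue = new Queue<int>();
        for (int dest = 0; dest < n; dest++)
        {
            for (int j = 0; j < n; j++) nextHop[j, dest] = -1;
            nextHop[dest, dest] = dest;
            queue.Clear();
            queue.Enqueue(dest);
            // Search backwards from the destination so every junction learns
            // which neighbour is one step closer to it.
            while (queue.Count > 0)
            {
                int k = queue.Dequeue();
                foreach (int j in incoming[k])
                {
                    if (nextHop[j, dest] != -1) continue; // already reached by a shorter route
                    nextHop[j, dest] = k;
                    queue.Enqueue(j);
                }
            }
        }
    }

    // Constant-time lookup a car can do when it reaches a junction.
    public int NextHop(int from, int destination) { return nextHop[from, destination]; }
}
```

With a table like this, a car at a junction does a single array lookup instead of running a search, and the cost of the search is paid only when the network changes.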
Memory allocation seemed to us to be the biggest challenge in developing games with a solid frame rate in XNA Game Studio. Reducing the number and size of memory allocations, especially during main game play, was key to avoiding performance penalties.
We used the CLR Profiler throughout development, together with the XNA Framework Remote Performance Monitor. Keep in mind that it can be a struggle to find your way around the allocation graph if your game allocates a lot of data during loading, but once you get used to it, it’s a very powerful tool.
During normal gameplay Carcophony allocates very little managed memory. Before the start of a map, after data is loaded, we force the garbage collector to do its job by calling GC.Collect(). That gives us a clean start, as the GC would only do a collection after we have generated about a megabyte of garbage.
We used what we call “memory pools”: a simple generic PoolAllocator class we wrote that can be instantiated with the type to be allocated.
PoolAllocator<Entity> Pool = new PoolAllocator<Entity>();
Then all allocations for that type would look like:
Entity entity = Entity.Pool.Allocate();
entity.Init(); // initialise the entity
And once the object has finished its job we release it like this:
entity.Shutdown(); // destroy the entity
Of course, this requires some discipline in calling the init and shutdown methods, but once you are aware of that it is easy to work with. It is a step back towards languages that don’t support garbage collection, but it certainly makes things a lot easier for the garbage collector, as it doesn’t have to step in and clean up after us.
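For readers who have not written one, a pool of this kind might look roughly like the sketch below. It is not our actual PoolAllocator: the capacity argument, the Release method and the InUse counter are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool: creates every object up front so that Allocate and
// Release produce no garbage during game play.
public class PoolAllocator<T> where T : class, new()
{
    private readonly Stack<T> free;

    public int Capacity { get; private set; }

    // Number of objects currently handed out; handy for an on-screen debug counter.
    public int InUse { get { return Capacity - free.Count; } }

    public PoolAllocator(int capacity)
    {
        Capacity = capacity;
        free = new Stack<T>(capacity);
        for (int i = 0; i < capacity; i++) free.Push(new T());
    }

    // Hands out a previously created object; the caller re-initialises it.
    public T Allocate()
    {
        if (free.Count == 0) throw new InvalidOperationException("Pool exhausted");
        return free.Pop();
    }

    // Returns an object to the pool once the caller has shut it down.
    public void Release(T item) { free.Push(item); }
}
```

Because the pool never creates objects after construction, an exhausted pool shows up as an exception rather than as silent garbage, which makes missing releases easier to spot.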
You have some named parameters in your Effect, right? Then you need to set some values. However, every time you call myEffect.Parameters["ViewMatrix"].SetValue() you will find that memory has been allocated and then “collected”.
To avoid that you can simply pre-cache those in member variables during initialisation:
ShaderViewProj_ = Shader.Parameters["ViewProj"];
And then during run-time use them directly like this:
ShaderViewProj_.SetValue(viewProj); // viewProj is a placeholder name for your own matrix
Works a treat!
Every time you assign, copy, concatenate or substring your strings – you will end up with some garbage. If you do that a lot you will end up with a lot of garbage and we do that a lot. :)
Pretty early on we decided we needed something to handle that. Fortunately the framework provides a class called StringBuilder. You can use it with the SpriteBatch and that’s all the good news you need, really!
We have a special wrapper on top of StringBuilder, conspicuously named StringFormat, that helps with formatting and concatenating numbers, strings and other data with StringBuilder.
Then we have a pool of StringFormat objects and we reuse those in our HUD system.
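To give a flavour of what such a wrapper has to do, here is a minimal sketch of appending an integer to a StringBuilder digit by digit, so that no temporary string is created. It is not our StringFormat class; the NumberText class and its Append method are hypothetical names.

```csharp
using System;
using System.Text;

// Hypothetical helper: writes the decimal digits of a number straight into a
// StringBuilder instead of going through int.ToString().
public static class NumberText
{
    public static StringBuilder Append(StringBuilder sb, int value)
    {
        long v = value;               // long so that int.MinValue can be negated safely
        if (v < 0)
        {
            sb.Append('-');
            v = -v;
        }

        // Find the divisor of the most significant digit.
        long divisor = 1;
        while (v / divisor >= 10) divisor *= 10;

        // Emit digits from most to least significant.
        while (divisor > 0)
        {
            sb.Append((char)('0' + v / divisor));
            v %= divisor;
            divisor /= 10;
        }
        return sb;
    }
}
```

A HUD element would typically keep one StringBuilder with enough capacity, set its Length to 0 each time the text changes, append the label and the number, and pass the builder to the sprite batch.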
We were aiming to support about 1000 records per map per machine, a bit presumptuous - we know. To avoid massive garbage leaks we had to make memory pools for lists with a lot of records in them, so there is a fair amount of memory hanging around when the system is loaded but none of it is leaking garbage.
We totally forgot, toward the end, about the “Parameter” fiasco and the game shipped with a few [garbage] leaks when we pitch shift sounds. We use XACT and set local and global variables to control various parameters of the sounds – however it’s not easy to pre-cache some of those parameters, so we simply decided it wasn’t worth the effort.
Those memory pools are sweet, but get some debug info drawn on screen and track what they do: leaking objects from them could lead to an out-of-memory condition, as the garbage collector will never collect them.
Allocating a lot of managed memory every frame is a performance disaster; however, trying to get rid of every last one of those allocations is probably a waste of time. Make sure you don’t allocate memory every single frame – beyond that it’s probably not worth it.
Carcophony uses three threads. One for the main game, one for file access and one for high score exchange. We intended these threads as a method to avoid any stalls during game play as these were operations that could occur in the background during the game – saving “license points” (challenges) progress or exchanging and processing high-scores.
The file thread was a worker thread that just executed file operation commands. These were of fairly large granularity, like “save challenges” or “save high score tables”. When the file thread picked up a file command it would open a container and then write the data into the file.
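A worker of this shape can be sketched as below. This is a generic .NET illustration rather than our Xbox code: the FileWorker class and its members are hypothetical, and passing delegates as commands allocates memory, so a no-garbage version would queue pre-allocated command objects instead.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical worker: runs queued commands one at a time on its own thread,
// so all file access is serialised and kept off the game thread.
public class FileWorker
{
    private readonly Queue<Action> commands = new Queue<Action>();
    private readonly object gate = new object();
    private readonly Thread thread;
    private bool stopping;

    public FileWorker()
    {
        thread = new Thread(Run);
        thread.IsBackground = true;
        thread.Start();
    }

    // Called from the game thread or the high score thread.
    public void Enqueue(Action command)
    {
        lock (gate)
        {
            commands.Enqueue(command);
            Monitor.Pulse(gate);
        }
    }

    // Finishes any queued commands, then ends the thread.
    public void Stop()
    {
        lock (gate)
        {
            stopping = true;
            Monitor.Pulse(gate);
        }
        thread.Join();
    }

    private void Run()
    {
        while (true)
        {
            Action command;
            lock (gate)
            {
                while (commands.Count == 0 && !stopping) Monitor.Wait(gate);
                if (commands.Count == 0) return;   // stopping and nothing left to do
                command = commands.Dequeue();
            }
            command();                             // run the slow work outside the lock
        }
    }
}
```

The game thread only ever pays for taking the lock and enqueuing; the container open and file write happen inside the command on the worker thread.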
Although this plan worked to a large extent, we never managed to completely get rid of the reduced frame rate during such operations. As there are no low-level profiling tools we can only speculate here, but it looked like “open storage container” operations had an impact on the game thread regardless of which thread executed them, and avoiding them would improve things. However, the alternative was to keep a storage container open, and we rejected that as it led to other problems.
A mistake we made early on was to check every frame, on our game thread, whether a storage device was connected. That seemed to allocate managed memory per call and put strain on the framework. Later on we moved to using storage device events and that solved most problems.
The high score thread runs in the background and exchanges scores with the other users currently playing a game. When scores are exchanged the high score thread would schedule a save command on the file thread. This was dictated by the fact that the file thread was the only one that could operate with the file system and serialising those operations made a lot of sense, especially when it comes to storage device manipulation and handling.
The data manipulated by the high score thread was needed by both the game thread and the file thread, and the high score thread could be working on the data at the same time. We opted for double-buffered data, where the thread that manipulates the data is the only one that writes into the write buffers, and then flips the buffers. The other threads only read the data from the read buffers. So the only “atomic” operation needs to be the buffer flip, and everything else is safe. Depending on the frequency of read and write operations there might be a need for triple buffering or a more complicated thread safety mechanism. For us, due to the way the system worked, a double-buffered solution worked fine.
Note: this is not dissimilar to how some lock-free data structures work.
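The flip can be sketched as follows. This is a minimal illustration under the single-writer assumption described above, not our actual code; the DoubleBuffer class and its member names are hypothetical.

```csharp
using System;

// Hypothetical double buffer: exactly one writer thread, any number of readers.
public class DoubleBuffer<T> where T : class
{
    private readonly T[] buffers;
    private volatile int readIndex;   // which buffer readers are allowed to look at

    public DoubleBuffer(T first, T second)
    {
        buffers = new T[] { first, second };
    }

    // Readers: treat the returned object as read-only.
    public T Read { get { return buffers[readIndex]; } }

    // Writer thread only: the buffer that readers cannot currently see.
    public T Write { get { return buffers[1 - readIndex]; } }

    // Writer thread only: publish the freshly written buffer. A single int
    // store is atomic, so readers see either the old buffer or the new one.
    public void Flip() { readIndex = 1 - readIndex; }
}
```

Note that a reader which fetched Read just before a flip keeps using the old buffer, which becomes the writer's target afterwards. If the writer can start on its next update while such a reader is still busy, that is the situation where a third buffer or a stronger mechanism is needed.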
Initially we didn’t plan to use a global high score system. As the game development progressed we realised that a global high score system would provide extra depth to the game. It’s always more fun to try and beat your friend’s score :)
We started by looking at Jon Watte’s excellent high score component. It worked perfectly well but our requirements were a bit different: we wanted to handle a large number of records, to have a separate high score table for every map as they differ in complexity and strategy, to not have any managed memory allocations and garbage (including strings), and finally we wanted to run the system on a separate thread to avoid any stalls when data sets got large, as we do a lot of sorting and data manipulation.
Note: even running on a separate thread, without any stress on memory allocation, it seemed that network exchange did have an impact on the main thread – possibly something we did wrong there, but the impact was negligible and we didn't pursue it further.
Here is an outline of how the system works:
1. Load the high score tables
2. Prepare, sort and merge the data so that it is in a suitable format
3. Attempt to find other users that are playing the game
4. Exchange data with those users if any are found
5. Create a host session and wait for other users to connect with us and exchange data
6. If any data has been exchanged schedule another sorting and processing task and save the data to storage
Here are some of the details we thought were worth mentioning:
Data needs to be conditioned for two main purposes: in-game visualisation (read data) and communication with other players (send data).
We planned and tested for quite a lot of records and one of the challenges was to keep the send data small, so we had a hard limit on how many records we can send per connection.
Once we decided on a limit we had to come up with an algorithm that would propagate the tables without going over the limit. Here is what we came up with: we always include the top scores per table, because we’d want to know who’s the best. We always include local and friends’ scores, because we’d want our friends to see our scores even if we are not the best. Then we fill the rest of the allocated data with a random selection of records from the rest of the table.
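The selection rules above can be sketched like this. It is an illustration only: the ScoreRecord fields, the SendSelection class and its parameters are hypothetical, and unlike the game this version allocates temporary lists rather than using pooled buffers.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record layout for the sketch.
public class ScoreRecord
{
    public string Gamer;
    public int Score;
    public bool IsLocalOrFriend;
}

public static class SendSelection
{
    // table must already be sorted best-first. Returns at most limit records:
    // the top scores, then local and friends' scores, then a random fill.
    public static List<ScoreRecord> Build(List<ScoreRecord> table, int topCount, int limit, Random rng)
    {
        var send = new List<ScoreRecord>(limit);
        var rest = new List<ScoreRecord>();

        for (int i = 0; i < table.Count; i++)
        {
            ScoreRecord record = table[i];
            bool mustSend = i < topCount || record.IsLocalOrFriend;
            if (mustSend && send.Count < limit) send.Add(record);
            else rest.Add(record);
        }

        // Fill the remaining space with a random selection, using a partial
        // Fisher-Yates shuffle so no record is picked twice.
        for (int i = 0; i < rest.Count && send.Count < limit; i++)
        {
            int j = rng.Next(i, rest.Count);
            ScoreRecord tmp = rest[i];
            rest[i] = rest[j];
            rest[j] = tmp;
            send.Add(rest[i]);
        }
        return send;
    }
}
```

The random fill means that, over repeated exchanges, records outside the top of the table still have a chance to spread between machines even though each individual message stays under the limit.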
We had a few different sets of data, in some cases double buffered due to the threading nature of our high score system:
• Main database – this was the main data structure where records were maintained; this data was only manipulated by the high score thread and no other threads would touch it
• Receive buffer – this was a buffer of records where scores that had been received from other players were stored
• Pending scores – these are scores that have been generated locally (by playing the game) but for some reason weren’t authorised for online distribution – we only distribute global scores from valid Xbox LIVE profiles.
• Send buffer – this was a buffer that contains a limited number of records that the local system will share with any connected users
• Aggregate buffer – this was the buffer that contained aggregated lists in display friendly format available for game usage and visualisation; it was also used to save the data to disk; it was double buffered as it could be read by the main game thread and the file system thread and written only by the high score thread when data had been processed and sorted; usually this buffer was then further processed to create the “send buffer” as data was
Note: some of these buffers were actually multiple buffers – one per map, each representing a different high score table.
Pretty early on we realised that tweaking UI elements in code is a huge time waster. So we created a small UI editor in C#.
It may look like a huge task, but if you have experience with editors [and if you don’t, it’s a good idea to start now] knocking one out in .NET is a matter of days if not hours. Ours was ready within a week – it improved over the course of development but not much. It not only allowed us to tweak UI elements for the game, but we could also see what we were going to get in game, as you can have a Windows Form that runs the XNA Framework.
We made a simple map editor, again in C# and Windows Forms. The idea was to be able to make a lot of maps and we did try a few. However, it turned out during development that certain game play decisions limited the size and topology of the maps.
Having a map editor at our disposal, however, did allow us to tweak some of the maps almost to the end.
Looking through the “evil checklist” would give you an idea of what technical cases other XNA developers struggle with. We did that nice and early. However, what we didn’t do was research the common practice for handling these cases.
To do that you have to peer-review other games and read what people pick up on in the review threads of the games. Don’t postpone reviewing other games until your own game is in review; do it as early as possible so you can see what other people get failed for and what reviewers pick up on as defects. You’ll be surprised – we certainly were.
Controller and profile management issues are a common source of problems. Getting them sorted early in the project will save you a lot of trouble... or at least think about them early and have a good set of rules for how you are going to handle them.
Getting Carcophony through the playtest and review process was a fairly smooth experience. We can say that play testing definitely helps but does not guarantee a problem-free peer review. You cannot rely on playtesting alone to find all the problems and crashes in your game – you have to test it yourself as much as you can. If you don’t, reviewers will. Even if reviewers don’t find any major problems and your game gets approved, your users will find them.
Note: we discovered a crash bug in multiplayer mode about two days after release - it didn't feel great.
Problems that were picked up during playtesting:
• safe title area – although we were aware of it at the start of the project, it got forgotten during development and it was a difficult task to dig ourselves out of that one after the user interface had been finalised
• small fonts – we followed Microsoft best practices and Carcophony does not contain any fonts smaller than 14px. However, fellow peer-reviewers and play testers flagged that 14px (and even 16px) fonts were very difficult, if not impossible, to read on small SD televisions.
• memory unit problems – we had neglected storage management during the development process, and having to fix these problems after the game had gone into play testing was challenging
The review process was straightforward, although a couple of minor issues were flagged up, some of them in areas we had specifically made a point of checking before final submission.
Reviewing other games certainly seemed to encourage people to review our own game. However, as mentioned earlier, do not leave peer-reviewing other games for that final stage – it can be a source of valuable information before you reach the final peer-review stage.
During development we lost some data on a couple of occasions. The first time, one of the HDDs just got its files mixed up and we lost about a week’s worth of work. The second blow came when the main HDD of one of the dev machines completely stopped working one morning. We’d been very careful to back up most source data but, as it turned out, we had forgotten some of our music source files...
To top it off, just as we went into play testing and started to realise we had to deal with a few Xbox-specific issues like memory units and safe frame... our Xbox broke. Luckily we did have a spare one.
So, if we had to sum up our experience in a few words:
• pre-calculate as much data as you can
• don’t allocate memory every frame but also don’t get too paranoid about allocation
• shifting file operations to another thread is a winner
• a UI editor is a must for fast UI production/tweaks
• reviewing other games will give you a good idea what your game will fail (or not fail) on
• think about your storage device management and error cases – don’t leave it for the last moment
• back everything up as often as you can
Thanks for stopping by and reading this article. We would like to thank everyone who play tested and peer-reviewed our game and in particular Mr Aubergine and ColdBeamGames for repeatedly coming back and giving us valuable feedback.
Shawn Hargreaves, 2008: Understanding XNA Framework Performance, Profiling Tools [http://bit.ly/4vfkI3]
Frank Savage, GDC 2009: Faster How to Improve XNA Game Studio Title Performance, GC Analysis [http://bit.ly/3LaTNX]
Jon Watte, The XNA Highscores Component (distributed leaderboards, sort-of) [http://www.enchantedage.com/highscores]
WinForms Series 1: Graphics Device
WinForms Series 2: Content Loading
The Evil Checklist for playtest/peer review
Best Practices for Indie Games