Skip to content

Optimizations and Improvements

Today is my brother's birthday - the second one that he hasn't been here to celebrate with us. Each day is still a struggle, and special days like this are even more so. The band Bayside has a song called Winter, which talks about the death of their former drummer John Holohan:

Should we still set his plate?
Should we still save his chair?
Should we still buy him gifts?
And if we don't, did we not care?

Anthony Raneri has a way with words, and unfortunately this is far from the first death that made me think of this song.

In an attempt to distract myself again, I decided to look back and see where the game was at around his birthday last year. It looks like we had just finished adding audio to the engine, and began implementing the ProgressableRenderFeature in preparation for the first healing prototype. While we've certainly come a long way, I'm glad that the main idea of that work is still core to the game today.

I'm proud that the game is really about healing without killing - it seems like the two have to go hand-in-hand, since there is no point to healing without taking damage. Instead we've settled on a design that involves the player hurting themselves in order to score the most points. Maybe the design of this game is a metaphor for real life or something... I don't know.

I'm still waiting for Google to approve the most recent Android build which includes the leaderboard. I've been thinking more and more about posting the game on the World of Warcraft subreddit, in order to hopefully get feedback from players that actually enjoy the healing gameplay. The prevalence of Android amongst those players was the primary reason that I spent a couple of weeks focusing on the platform, but during that time, my daughter and my neighbor both asked about Android support so that they could recommend the game to more people. I'll be happy to allow more people to play the game, and I now have a base for any future mobile games that I decide to build.

I've pretty much given up on any sort of deal with Apple. A friend recently sent me an article about the decline of Apple Arcade. It's purely speculative, but it certainly made me feel like our game is simply inadequate to be placed on such a platform alongside more "premium" titles.

As it stands, I feel like the game just has to be entirely free. The dozens of testers I've had so far all love the ad-free UI, myself included. I had being bombarded with advertisements literally everywhere I look. Unfortunately, by making the game free without ads, I feel like I'm contributing to a dangerous precedent in which game developers are not rewarded for their efforts. Building and releasing an entire game from scratch will certainly provide me with opportunities in the future, but it's really hard to quantify that.

I'm going on a big trip in a couple of weeks, and I'd like to move into a more public testing phase before then. While I wait for Google's approval, I can focus my efforts on optimizing the game.

24.1 Physics

My daughter has an old 2nd generation iPhone SE from 2020, and while the game is completely functional, there are some noticeable hiccups when there are too many objects on the screen - precisely when you don't want any hiccups!

I did some performance profiling of Alfredo in CLion, which told me exactly what I expected: the CollisionSystem is the primary offender.

Pre-Optimization Profiling in Alfredo

The isn't surprising at all. The CollisionSystem uses a very brute-force O(n^2) algorithm which iterates over all PhysicalStates within a loop that also iterates over all PhysicalStates. This was good enough for the prototyping phase, but it's definitely not production-ready - even if it "works", it's just a waste of battery life on mobile devices.

What's more impressive is that Alfredo's render frame rate isn't even limited to the screen's refresh rate, so while fixedUpdate() is limited to 50 ticks per second, update() is ticking as fast as it possibly can (between 400-500 frames per second), and the CollisionSystem still makes up for ~3/5 of the total runtime cost. Enabling v-sync in Alfredo results in the CollisionSystem making up for a whopping ~75% of the total runtime cost!

Pre-optimization Profiling in Alfredo with V-sync enabled

Keep in mind that this computer's display refreshes at 120Hz, while a 2nd generation iPhone SE only refreshes at 60Hz. Clearly this would be the single most impactful optimization we can make.

Quad Trees

Luckily, there are many well-established solutions to optimize physics calculations by reducing the total number of comparisons that occur each frame. One of the most common solutions in 2D games is the quadtree - a tree data structure in which each node contains 4 children. The purpose of the data structure is to organize game objects into spatial partitions. This allows us to skip expensive calculations between objects that are in entirely different partitions. The 3D counterpart of the quadtree is the octree, and you could theoretically make an N-tree for any given dimensional depth.

This series by David Geier does an excellent job of concisely explaining octrees and some of the trade-offs that must be made when implementing them. While David focuses on octrees, the contents of the posts apply perfectly to quadtrees as well.

My implementation has the following properties:

  • The entire tree is encapsulated within a World, containing a pointer to the _root.
  • The World can dynamically expand by creating a new _root, which contains the old root as a child.
  • Each node contains its own BoundingBox representing its extent.
  • Entities contain their own BoundingBox (added as a component), representing the smallest box that contains their collider (or ray).
  • Entities are contained within the deepest node whose bounds entirely contain their own bounds.
  • The World (and all QuadTree nodes) are updated every frame in order for moving objects to move between nodes.
  • Empty leaf nodes are dynamically pruned in order to prevent infinite memory growth as our ship moves through space.

Organizing our tree in this way allows us to iterate over the entities of a given node and compare it to all of the other entities of that node, as well as all of its child nodes. A quick printf() suggests that this algorithm reduces the total number of comparisons from around 20,000 per frame down to fewer than 500 per frame - sometimes much fewer, depending on the positions of the stars!

Running the same profiling test on Alfredo with v-sync enabled results in the CollisionSystem making up for only ~13.5% of the total runtime, down from ~75%!

Post-optimization Profiling in Alfredo with V-sync enabled

While this section is short, I actually spent an entire Sunday working on the code. I started by building the QuadTree without any sort of Entity integration, only using BoundingBoxes. Once I got the basics working, I added the World to enable dynamic growth, enabled by exchanging the _root node over time. The WorldNode component used to contain an internal ID managed by the World (similar to Renderables within the Renderer), but they turned out to be useless. Finally, I refactored all of the function parameters to use Entity references instead of BoundingBoxes, to allow the CollisionSystem to continue doing its work using the various collider components.

There were bugs along the way, of course. At one point I got rid of all of the stars just so that I could debug why collisions between the ship and asteroids weren't working correctly! It turns out that, during my BoundingBox to Entity refactor, I broke some of the recursive logic in the QuadTree that enabled objects to move between different nodes. The result was that, according to the QuadTree, the ship level moved from its original location! An easy fix, but a nightmare to debug.

That is actually the worst part about this implementation: it sacrifices readability for performance. It's not particularly complicated to understand how a quadtree is supposed to work, but the code is not trivially readable - at least not compared to the simple O(n^2) nested loop of comparing all entities to all other entities. This is true of many optimizations, and one major reason for avoiding early optimization entirely. While the performance gains can be substantial, the additional complexity can be a hindrance when building a game from scratch.

There was one little change I made that I want to call out specifically: I added a * operator to the Component<T> class to retrieve a reference to the actual underlying data. This allows me to do things like replace the entire contents of an entity's BoundingBox:

linguine/src/systems/CollisionSystem.cpp snippet

auto boundingBox = BoundingBox{};

// Configure boundingBox
...

auto boundingBoxComponent = entity.get<BoundingBox>();
*boundingBoxComponent = boundingBox;

While this is neat, I need to be careful about using this, because the reference returned by the operator is completely invalidated if I add or remove a component to that entity!

24.2 Component Queries

I'm going to start off by saying that I'm not going to change anything about how we query for entities at this phase of development. I wanted to address it because the profiling results clearly show that these queries make up for the majority of the game's runtime, now that the physics system has been optimized, making up for about 50% of the game's runtime.

The reason is pretty simple: even though the archetype graph is itself an optimization over iterating over all of the entities blindly, it's still an O(n^2) algorithm to iterate over all the archetypes and find the ones that are relevant to the set of components that we are looking for.

I don't necessarily think it was a bad idea to implement an ECS architecture at the beginning - it has definitely simplified the iteration of the design process. However, ECS excels in performance when you make few queries that contain lots of results, allowing you to rapidly iterate over those results with incredible performance due to CPU cache coherence. Our game does not contain very many entities, nor are the entities that it does contain particularly similar.

Rather, the thing that this architecture has afforded us has been the ability to rapidly refactor the game systems without worrying about object structure. Adding new functionality is as easy as creating a new System, adding in some findEntities<>() queries, and mutating the results as needed. No need to worry about entities pointing to other entities or which functionality they each have, the data and logic are entirely separate.

However, it probably would have been just as easy to follow a traditional entity-component architecture, where each component contains both data and logic, and entities contain sets of components (similar to Unity). Maybe I'll give that a shot for a future project, or maybe I'll try to further optimize our ECS queries.

For now, I just want to get this game released, and its current performance is good enough.

24.3 Leaderboard Adjustments

Quite a few people have been playing the iOS version of the game, including some of my friends, and a handful of kids from my daughter's school. It's been fun watching them compete over leaderboard positions, though none of them have managed to beat my own high score yet.

One score entry in particular contains a username that is in Hebrew. My custom-made font obviously doesn't support Hebrew characters, which isn't really the problem - the username is so long that it covers up the score entirely!

Long leaderboard name with unsupported characters

Since there is space for 26 characters across the horizontal space of the leaderboard row, I'll generate a max length for the name by subtracting the length of the rank and score strings (and spaces), and trim the name based on that length.

Trimmed leaderboard names

24.4 Use-After-Free Errors

I decided to test the recent changes on Android, and to my surprise, the game crashed as soon as I tried to leave the title scene. Apparently the World (owned by the scene) was getting destroyed before the removal listener for the BoundingBoxes were getting called, leading to the error.

Rather than dealing with straightening out the order of operations, I just made the scenes and CollisionSystem hold a std::shared_ptr of the World.

24.5 The Tutorial... Again...

New players don't seem to understand that they are supposed to heal the shields at all. The current new player experience spawns asteroids until they collide with one, and then stops spawning them altogether until the player heals their shield for the first time. The shield is highlighted with a glowing orange border, but that doesn't seem to be enough for players to realize what they are supposed to do. Many players simply assume the game is broken because nothing is spawning!

If I continue to spawn asteroids, then they might end up killing themselves without realizing what they could have done to prevent it, though I guess that's preferable to thinking the game is entirely broken.

I'll see what happens if I keep spawning asteroids without requiring a heal, but only start spawning mines and powerups once the player has learned how to heal. I'll make the glow a bit more apparent by pulsing the size of the frame, as well as the transparency.

Making the change to allow asteroids to keep spawning while waiting for the heal introduced a new situation in the spawn system in which the tutorial could end at unexpected times, which would cause a lot of new asteroids and obstacles to appear on the screen in weird positions. I did a little bit of cleanup, and I guess we'll see how people react.

24.6 Tandem Releases

Version 0.0.5 is the first time that I've released the game on both iOS and Android at the same time! I'll have to wait for both Apple and Google to approve the new version, so there's no guarantee that the new versions will become available at the same time, but it's still a pretty cool milestone for me.


Well, Apple managed to approve the new version before I even woke up the next morning. Google, on the other hand, took 4 business days before the new version was available.

With the new version available, I decided to try my hand at posting the game to the WoW subreddit. Unfortunately, the post was taken down by moderators because it wasn't actually about WoW. Honestly, I don't know what I expected. While the post wasn't up for very long, it still managed to reach the "front page" of the subreddit and generate a handful of installs - 5 on iOS and 1 on Android.

With that post completely flopping, I decided to post to the gamedev subreddit instead, focusing on this devlog rather than the game (but still providing links to the game). Unsurprisingly, the devlog got quite a bit of traffic, but only 1 or 2 people actually installed the game. It turns out that game developers are more interested in how you built your game rather than playing the actual game.

I can't help but feel disappointed by the lack of responses and feedback. After working so hard on the game and feeling very proud of it, the silence is disheartening, and generally indicates that the game doesn't stand out at all. It's not good enough to praise, and it's not even bad enough to criticize - it barely exists.

I'm genuinely not sure where to go from here.