
Getting Started

     Alright, let's do this. In the introduction, I wrote a simple "Hello world" application in C++. While the code itself is syntactically correct, I can't actually execute it without compiling it first. In school, they taught us how to invoke the compiler directly - something along the lines of g++ -o hello_world main.cpp, which would let us execute the program with ./hello_world. On second thought, did I actually learn that in school? I don't remember anyone teaching me that, I just remember doing it for school assignments. In any case, this style of compilation may be fine for school projects, where all of your logic lives in a couple of files, but it becomes unwieldy as a project grows more complex, whether due to a large number of files, cross-platform compatibility, dependency management, or otherwise.
     Larger projects tend to use some flavor of build system, which can drastically simplify the management of compilation logic. Choosing which build tools to use generally comes down to preference, but sometimes it is clear when a tool isn't well-suited for a particular job. For example, Microsoft's Visual Studio and MSBuild are widely popular for compiling video game projects written in C++, but the macOS version of Visual Studio does not support C++, so it doesn't do us much good in this case, despite its popularity in the Windows ecosystem.
     I don't want to give a rundown on every possible build system and their pros and cons, so I'm just going to come out and tell you that I will be using CMake 3.21.4 and Ninja 1.10.2, and my compiler will be Clang 13.0.0. CMake is remarkably simple, while providing the extensibility that this project will require. Ninja provides blazing fast compilation speeds, at the expense of readability, which is no big deal because we'll be using CMake to generate Ninja's build files. I don't have strong opinions in the Clang vs. GCC debate, but Clang is the default compiler on macOS, and it should be fine for any platforms I might happen to target in the future, so I'm running with it. Since this version of Clang doesn't have full support for C++20 yet, we will be using C++17.
     For my development environment, I will be using JetBrains CLion 2021.3.3, which I might update throughout the duration of the project. My 14" MacBook Pro is currently running macOS 12.1 "Monterey", and contains Apple's M1 Max SoC, which has 10 processing cores (8 "performance" cores and 2 "efficiency" cores), 64GB of memory, and a 32-core GPU (which has shared access to the same pool of memory).

2.1 Project Structure

     I need a code name for this project. Hang on a second, I'm asking my wife for a name. She chose "Linguine", so let's get this show on the road.

main.cpp

#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

CMakeLists.txt

cmake_minimum_required(VERSION 3.21)
project(linguine)

set(CMAKE_CXX_STANDARD 17)

add_executable(linguine main.cpp)

     These are the files that CLion generated for me, but I would have written them the exact same way, except for the tacky comma and capital "W" in the output string. At least they opened the curly brace on the same line, right? The code compiled and ran just fine without changing any of my default settings. I just double-checked that the IDE is properly using the correct versions of all of the build tools.
     There are a few directions I can go from here, and it's actually kind of a difficult choice. I won't be able to create my iOS application from this IDE, at least not that I know of. I'm sure it could be possible, but I'd lose out on a lot of really convenient features, like the ability to launch and debug the app on a simulator or an actual device (you know, the features that warrant using an IDE over a text editor). It is more likely that the code I write for my "engine" will be exported as a library, which can be imported into an Xcode project, where I would wire up all of the platform-specific functionality into the API for my engine library. JetBrains also offers AppCode, which might play nicer with my CLion installation, but I can't know that for sure.
     Does that mean I should go ahead and whip up an empty iOS app? I don't think so. There's just no point in doing so yet, since there's no engine library to hook into. Instead, I think it would be a more valuable use of my time at this point to start building the engine as if it were a plain-old C++ application. As we get a better idea of what platform-specific details our engine will need, we can refactor the engine to take them in as dependencies rather than constructing them itself.
     This is all purely theoretical, but based on previous experience. There is one decision that I'd like to make right now though: to what extent should I support running the game on my computer? It becomes way easier to debug the engine and game code if I can run it directly from CLion, rather than compiling the engine into a library and importing it as a dependency to another application entirely. However, the entire reason I decided to only support one platform was to reduce the amount of time I would spend writing platform-specific code.
     I think the answer lies somewhere in between "fully supported desktop application" and "no desktop application whatsoever". The desktop version of the application doesn't have to be polished or deliverable, it simply needs to fill the gaps that would exist if I only had an iOS app. I should be able to share quite a bit of platform-specific code between iOS and macOS - for example, both platforms use the Metal API for graphics and the Core Audio API for, well, audio. Anything that can't be shared should not be considered production-grade in the desktop app, but should be thoroughly tested in the iOS app. While the focus of the iOS app should be releasability, the focus of the desktop app should simply be rapid iteration.
     So let's refactor our CMake project a bit - "Linguine" will be a library, as we discussed, but we will also make an executable that depends on Linguine - let's call it Alfredo, kind of a pun on the "alpha" version of the engine. Just as a sanity check, I'll add a static method to Linguine which simply returns a string, and I'll print out that string from Alfredo.

linguine/include/something.h

#pragma once

#include <string>

namespace linguine {
  class Something {
  public:
    static std::string GetIt();
  };
}

linguine/src/something.cpp

#include "something.h"

namespace linguine {
  std::string Something::GetIt() {
    return "Hello world!";
  }
}

alfredo/src/main.cpp

#include <iostream>

#include "something.h"

int main() {
    std::cout << linguine::Something::GetIt() << std::endl;
    return 0;
}

There, I fixed the string. All joking aside, I'll be loosely following Google's C++ style guide. I'm almost positive that I'll write some code that doesn't follow it, especially without a team of developers scrutinizing my every variable name. I've actually already broken the rules of this style guide simply by using the .cpp file extension instead of .cc - let's just say I'll be using my own consistent styling inspired by Google's style guide. In any case, this code compiled and ran just fine, after adding the proper CMakeLists.txt files, but I'll spare you those details.
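For the curious, the gist of those CMakeLists.txt files looks something like the following - a sketch of the structure rather than my exact configuration (target names match the directories above):

cmake_minimum_required(VERSION 3.21)
project(linguine)

set(CMAKE_CXX_STANDARD 17)

add_subdirectory(linguine)
add_subdirectory(alfredo)

The linguine/CMakeLists.txt builds the engine as a library and exposes its headers, while alfredo/CMakeLists.txt builds the executable and links against it:

# linguine/CMakeLists.txt - the engine library
add_library(linguine STATIC src/something.cpp)
target_include_directories(linguine PUBLIC include)

# alfredo/CMakeLists.txt - the desktop executable, which depends on Linguine
add_executable(alfredo src/main.cpp)
target_link_libraries(alfredo PRIVATE linguine)

The PUBLIC include directory is what lets Alfredo's main.cpp write #include "something.h" without knowing where the header actually lives.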

2.2 The Game Loop

     Game engines are complex beasts, but their function can generally be boiled down to a few steps:

  1. Receive inputs
  2. Update game state
  3. Output results
  4. Repeat

These steps happen very quickly, ideally at least 60 times per second, maybe even up to 120 or 144 times per second on modern devices (the latest iPhone 13 Pro has a display capable of refreshing at 120Hz).

The basic implementation of these steps is called the "game loop":

// Stub implementations of the three phases - a real engine fills these in
void receiveInputs() {}
void updateGameState() {}
void outputResults() {}

int main() {
    while (true) {
        receiveInputs();
        updateGameState();
        outputResults();
    }

    return 0;
}

A simple game may execute these steps in-order as shown, while more complex engines may implement multi-threading or job systems to allow the steps to execute concurrently. In my experience, the final implementation is never so simple. While there is beauty in simplicity, the reality is that we have to work with what we're given: platform APIs tend to impose certain restrictions that we don't necessarily have control over. It is these types of problems that we must overcome to create a performant, yet maintainable, game engine.

The "inputs" that we must receive are generally things like keyboard events, mouse movements, or touchscreen gestures. We must receive these events from the operating system in order to use them for our game logic. Additionally, we may choose to receive network traffic from a central game server, or gather data from non-traditional input devices (for a game), such as a camera or microphone.

The "game state" consists of all the things relating to our internal representation of our virtual world. This might include the amount of money your character has, how many hats he's collected, which direction he's currently facing - everything! With the knowledge of the most recent inputs, we can update our game state based on the amount of time that has passed since the last time we updated it. This "time step" can fluctuate based on how much work has to be done during each iteration of the game loop, but is often dependent on the duration of the output phase of the loop.

The "outputs" include all of the ways that our game state can be conveyed to the players. The most obvious output is the rendering logic, which allows us to visualize what is happening within our virtual world to the player. Sound effects and music are often overlooked, but are incredibly important for the player's immersion. We might also transmit our updated game state to a central server, in the case of a multiplayer game. Additionally, mobile devices and game pads often include vibration motors for haptic feedback which our game can control.

Since a single iteration of this loop involves rendering an image for the display device to show to the player, an iteration of the loop is synonymous with a "frame".

There isn't really a single "correct" starting point when building up the scaffolding for a game. If I had decided to use a pre-packaged game editor, then I could jump right into building out (and iterating on) the mechanics of my game. That is what I would recommend for most people who want to build games, but it wouldn't be very interesting for me. Hopefully I will eventually be able to display the level of control that building up your own engine affords, without all of the unneeded bloat.

2.3 Window Creation

I've never actually built anything specifically for Apple devices before. In my previous game engine projects, I used GLFW or SDL2 for creating windows and receiving inputs, simply because those were cross-platform solutions that allowed me to run my game on macOS, Windows, and Linux without any changes. Similarly, I've used OpenGL and Vulkan for rendering in those projects, though OpenGL is deprecated on macOS and iOS, and Vulkan isn't technically supported without the use of a runtime translation layer (like MoltenVK), which had some unexpected quirks that required workarounds.

This time, I'll be writing my application using Apple's officially supported APIs - AppKit for macOS, UIKit for iOS, and the Metal API for rendering on both (though the actual implementation might need to change between the platforms). It looks like sometime in the last year, Apple released a C++ library for Metal (aptly named metal-cpp), and, within the ZIP file containing a tutorial, there is also a metal-cpp-extensions library which provides a handful of convenience functions. The tutorial is called "Learn Metal with C++", which I found at https://developer.apple.com/metal/sample-code/. Using these libraries should be much simpler than building a Swift-based application that invokes my C++ core logic over a C interface, but time will tell.

At first glance, the Metal API appears very similar to the Vulkan API. These verbose rendering APIs, along with DirectX 12, are collectively referred to as "modern" rendering APIs, due to the power enabled by having such fine-grained control over the GPU. Older rendering APIs (like OpenGL) are very simple to use, but require heavy graphics drivers to translate the high-level commands into a series of low-level instructions. Exposing the lower-level commands both frees the GPU manufacturers from having to implement these one-size-fits-all solutions in the driver, and gives the application developer the control to express their intentions properly. Unfortunately, the verbosity makes them a bit difficult to learn, but the added power makes it worth the effort. Apple has deprecated OpenGL in order to encourage developers to use Metal directly, but I'm not sure how long it will be before they remove OpenGL from their systems entirely. I could probably make it pretty far into this project using OpenGL, but if Apple decided to pull the plug on it, I'd have to rewrite the renderer in Metal anyway. I'm not opposed to learning new things, and architecturally, future-proofing is the right call, so I'll stick with Metal.

I modified the Metal tutorial project to make it work with CMake and CLion, instead of Apple's preference of Xcode. After some tinkering, all of the samples seem to compile and run just fine. I copied metal-cpp and metal-cpp-extensions into a third_party directory at the root of my project, and I modified Alfredo's CMake configuration to include them, as well as link to the Foundation, Metal, and MetalKit libraries.

I generally dislike these APIs, at least when dealing with them from the C++ library. Rather than using C++'s smart pointers, or even manually allocating and cleaning up objects with new and delete, this API instead prefers that every object be allocated using a static alloc() method for each class (which returns a pointer to an instance of the class), and expects that you release() the result when you're done with it. It works; it's just weird that they chose to implement the memory management this way instead of utilizing the already-established patterns of C++. It's entirely possible that this pattern is just a result of the C++ wrapper not having direct access to the "native" memory allocator for which these APIs were originally intended, and I can't hold that against them - it is, after all, convenient for me that the C++ library exists at all!

Each sample in the tutorial is a single file which contains only the bare minimum amount of code required to achieve the result. As such, the structure of each program is very rigid, entirely non-modular, and a bit chaotic to navigate. These programs were written to get to the point. They completely forego common software development practices for the sake of illustrating a very specific requirement. This structure is fine for these examples, but they should not be blindly copied into your own project, expecting them to magically turn into a modular game engine capable of rendering arbitrary numbers of objects containing multiple types of shader programs or textures.

I need to learn from these samples, not rip them off. Once I understand how the samples work, then I can paint a mental picture consisting of all of the necessary abstractions that work together to achieve the larger goal.

The first thing I'll need is a surface which I can draw to. On macOS, this apparently means I'll need to create a MTK::View. The View can be set as the content for a NS::Window, which is literally a visible window that I can see and interact with, so I'll create one of those too. Unfortunately, it can't be that simple - creating a Window doesn't seem to do anything on its own, so it looks like I'll have to learn a bit about the macOS application lifecycle. Luckily, Apple has some pretty good documentation (at first glance) at https://developer.apple.com/documentation/appkit/app_and_environment. It looks like I'll need to get an instance of NS::Application using the NS::Application::sharedApplication() method, which I can then run() to start the main event loop from the window manager. These events will eventually let me handle things like window resizing, menu options, and even input events (very handy for a video game). However, the call to run() seems to block indefinitely until I kill the process, so when am I supposed to initialize the window? The answer is that I can set an NS::ApplicationDelegate on the NS::Application instance using the setDelegate(NS::ApplicationDelegate*) method, which will allow me to handle lifecycle events, such as applicationDidFinishLaunching(NS::Notification*), which appears to be the perfect event to initialize my window:

Delegates can implement this method to perform further initialization. This method is called after the application’s main run loop has been started but before it has processed any events.
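Putting those pieces together, the skeleton looks roughly like this - a sketch based only on the method names mentioned above, since I haven't shown the real metal-cpp-extensions signatures (they may differ slightly, and this obviously only compiles on macOS with those headers available):

// Sketch only - requires macOS and the metal-cpp-extensions headers
class MyAppDelegate : public NS::ApplicationDelegate {
 public:
  void applicationDidFinishLaunching(NS::Notification* notification) override {
    // The run loop has started, so it's now safe to create the
    // NS::Window and its MTK::View content here
  }
};

int main() {
  MyAppDelegate delegate;

  auto* app = NS::Application::sharedApplication();
  app->setDelegate(&delegate);
  app->run();  // blocks, pumping the main event loop

  return 0;
}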

That seemed to do the trick! The window is actually pretty hard to even notice - it opened in the background, there is no entry on my dock indicating that the application opened, and even when I change my focus to it, there are no menu items, and command+Q doesn't quit the application like you'd expect. Closing the window works, but the application continues to run until I stop it from my IDE. Let's tinker around a bit more and see what we can do to make it act more like a normal window.

I've additionally implemented applicationWillFinishLaunching(NS::Notification*) in order to create menu items for the application, as well as to set the "activation policy" to NS::ActivationPolicy::ActivationPolicyRegular, which enables the application to have a dock icon. Constructing the menu is a very compositional process, which would apparently be a lot cleaner had I used SwiftUI. I also implemented the applicationShouldTerminateAfterLastWindowClosed() method, since this application should never be running without a window. I won't bore you with the actual code - it is very similar to what was included in the Metal tutorial's sample program for window creation.

I do find something about this rather peculiar though: since invoking run() on the application blocks until the application closes, there is currently nowhere for me to manage my own game loop. Ideally, I can just poll for system events on my own (in my "inputs" phase), which I can then make sure all get processed before updating my game state. When in doubt, Google it! All I did was search for "nsapp run" - the first couple of results are Apple's official documentation, but the third result is a Stack Overflow post entitled "How to make [NSApp run] not block?", which is exactly what I'm trying to accomplish! It seems that the question is specifically trying to emulate how GLFW treats windows, which (as I mentioned before) is what I'm accustomed to working with. The answers indicate that the secret sauce is to call stop() on the NSApplication (called NS::Application in metal-cpp-extensions) from the applicationDidFinishLaunching(NS::Notification*) method, which causes run() to return, but instead of destroying the window, just continue polling for events manually. Let's give it a shot.

Immediately, I noticed another peculiarity: the NS::Application class does not actually contain a stop() method, even though it's clearly documented. The terminate() method exists, but that is much more destructive than stop(), so it looks like I'll need to modify the metal-cpp-extensions library. I basically just copied how the terminate() method was defined and renamed everything to stop instead. It looks like the whole library is structured around sending "messages" to the system. Adding app->stop(); to my applicationDidFinishLaunching(NS::Notification*) method results in the window opening and closing very quickly, as expected. If I add an infinite loop after my call to run(), then the application freezes with the window still displayed, which means a couple of things:

  • run() returns almost instantly instead of blocking until I close the app
  • Which means stop() was called
  • Which means applicationDidFinishLaunching(NS::Notification*) was called by the system as a result of calling run()

So far, so good. Now I need to modify my infinite loop to poll for events so that the window doesn't completely freeze. Similar to the stop() method, I'll have to add the nextEvent() method to the extensions library. Unfortunately it's a bit more complicated, since the method takes in some parameters whose types are also not yet implemented. This is already getting under my skin, but I will be patient for now.

The type of the first parameter of the nextEvent() method (according to Apple's documentation) is an NSEvent.EventTypeMask. When I search through the docs for that type, they state that it is a struct and that there are many static globals that represent different values that the struct might have, such as leftMouseDown, beginGesture, or applicationDefined. There is no explanation of what data members the struct might contain, but there is a hint in its init() function, which takes in a rawValue of type UInt64. It seems that all of these values are represented by a single 64-bit unsigned integer, so I'm not sure why they are calling it a "struct". I managed to dig through the header files in the developer SDK (which gets installed along with Xcode), and found the values for the possible masks within NSEvent.h. Each mask appears to be a bit-shifted integer, in which the 1 is shifted to the left by the same number as the enum value being represented by that mask. The only value that breaks this rule is the mask for "any", which is just all 1's (NSUIntegerMax, which is defined to be ULONG_MAX). This type of masking pattern allows the developer to bitwise-OR multiple possible mask values together into a single value. Therefore, setting all of the bits to 1 is a quicker way to represent "any" than bitwise-OR'ing all of the other possible values together. Unfortunately, I can't just include these header files directly, because they are Objective-C header files, not C or C++.

I could spend my time porting values from the Objective-C headers into my already-modified extensions library, but honestly I didn't expect this library to be so incomplete when I decided to use it. I think I'd like to take a step back and evaluate other options for building a macOS (and eventually iOS) application, while continuing to keep the core of my game written in C++.

2.4 Exploring Options

JetBrains AppCode

I mentioned this IDE earlier in the chapter, and while it intrigued me at the time, JetBrains actually decided to sunset the product within the last month. While I'm sure their reasoning is sound, it's a non-starter for me to work on a new project with an already-dying technology.

Pivot to Swift or Objective-C with Xcode

I didn't want to learn a new programming language as part of this project, but working directly in Xcode would make iteration times for iOS minimal, since it trivially builds for and deploys to the iOS simulator, as well as physical devices. I'm not particularly worried about learning the iOS SDK, but I am concerned that the memory model of these languages might not be what I'm looking for with a game engine - they could be just fine, but I assume the worst with unknowns.

I've barely written any code so far, so pivoting now wouldn't be a big deal. The main reason I don't want to use Xcode is that it locks me into only supporting Apple devices. While my primary objective is to release the game for iOS, if I ever found myself wanting to port the game to another platform (Android, for instance), then I'd have to rewrite the entire game in another language! Furthermore, I couldn't even develop the game on a non-Apple computer - if something happened to my laptop, I wouldn't be able to open the project on my Windows or Linux desktops.

Interoperability between C++ and Swift or Objective-C

I structured this project such that the underlying game code could be written exclusively in C++ (this is Linguine), and platform-specific projects (like Alfredo for macOS) could link to the game code library. It is entirely possible that the code required for a given platform can't be written in C++ (or would be prohibitively difficult to write). In that case, we could write the platform-specific code in another language entirely, using the platform's preferred frameworks, and ideally still link to the shared game code library.

This introduces quite a bit more complexity into our project structure. Depending on the platform, our current choices of CLion as an IDE and CMake as a build system may not be compatible. In that case, we may need to compile our game code library separately from the application which uses it, turning every change to the game into a multi-step build process and increasing iteration times.

This seems to be what the Unity engine does. While you develop your game on your computer via the Unity Editor, you can create a build for any platform, provided your current platform supports the tooling required. Your game code gets compiled into a library, which is then linked into a common application shell for the target platform. You then have to install the built package onto a device prior to playing it. Even though the application works on the target device, debugging is a nightmare.

Platform-Agnostic Libraries

GLFW is very easy to use, and supports creating windows and receiving inputs on macOS. Browsing through their GitHub repositories, it appears that they implement their macOS window management in Objective-C, even though the library exposes a C interface. Encapsulating that implementation is clever, and the average user of the library would never have to know. While GLFW would quickly unblock me on macOS (and work just fine on Windows and Linux), it unfortunately does not support iOS or Android.

SDL2 is a bit more promising, with support for all of the platforms that I'm concerned about. It looks like they still require creating an Xcode project for iOS targets, which sort of means that it's not buying me as much as I had hoped.

SFML had iOS support in the past, but it doesn't appear to be supported anymore, and the tutorial for it on their GitHub page is now just a blank page. That's a bummer.

A Bit of Research

There's obviously a way forward with interoperability, otherwise Unity wouldn't be able to create iOS apps, and GLFW wouldn't implement its macOS window management in Objective-C. I wonder why GLFW chose Objective-C over Swift? A quick Google search shows that Swift has native interoperability with Objective-C, but not C/C++. In order to interop with those, you must first create an Objective-C wrapper around your C/C++ functions. Since you're already writing Objective-C, you might as well just go all-in on Objective-C rather than introducing even more languages into your project.

Whenever I see someone trying to interop Objective-C with C++ (rather than plain old C), I see the term "Objective-C++" thrown around. It looks like there are compiler flags available that let you write a file that can contain both Objective-C and C++, with some restrictions. For example, a "class" in Objective-C is not interchangeable with a class in C++, so you cannot instantiate an object of one language's "class" with the syntax of the other language. It sounds really weird but it actually makes sense in practice. The syntaxes of the two languages interweave to form a somewhat coherent hybrid program. As a quick illustration, let's say I added this function to Linguine's Something.h:

static int add(int a, int b) {
  return a + b;
}

Additionally, I created this class in my Objective-C file:

@interface Calculator : NSObject
- (int)subtract:(int)a from:(int)b;
@end

@implementation Calculator
- (int)subtract:(int)a from:(int)b {
  return b - a;
}
@end

In order to invoke the C++ function from my Objective-C code, I can't actually use Objective-C's function invocation syntax:

// Nope
int test1 = [linguine::Something add:5
                                  to:3];

// Nada
int test2 = [linguine::Something add:5
                                   b:3];

// Zilch
int test3 = [linguine::Something::add a:5
                                      b:3];

In fact, these examples are complete gibberish, since Objective-C function parameters have two "names" - one used for passing in the parameter, and the other used within the implementation - except the first parameter, whose external name is part of the function invocation itself! This is why my Calculator class's subtract method has a parameter of type int with an internal identifier of a, as well as a parameter named from, whose type is an int, and whose internal identifier is b.

Indeed, I can invoke the C++ function using C++ syntax, and then use the result within an Objective-C method call:

// 5 + 3 = 8
auto result1 = linguine::Something::add(5, 3);

// 8 - 6 = 2
Calculator* calculator = [Calculator alloc];
int result2 = [calculator subtract:6
                              from:result1];

While I abhor Objective-C's syntax, it's pretty cool that I can just slap some C++ into it and make it work. In Xcode, this works simply by renaming the file extension from .m to .mm. In CMake, I just need to add enable_language(OBJCXX) to Alfredo's CMakeLists.txt.

Wait, if I can write Objective-C from Alfredo, does that mean I can access the macOS API? Let me just re-write my C++ main method in Objective-C, changing my namespaced C++ objects into their Objective-C equivalents (for example, NS::Notification becomes NSNotification), and voila! The window opens exactly like it did before! I can even call [NSApp stop:nil] within my applicationDidFinishLaunching method and maintain my own event loop using the nextEventMatchingMask method!
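For reference, the Objective-C version of that non-blocking pattern looks something like this sketch (the selector names come from AppKit; the commented-out phase calls are placeholders for my own game loop):

// In my application delegate: stop the run loop once launch completes
- (void)applicationDidFinishLaunching:(NSNotification*)notification {
  [NSApp stop:nil];
}

// Back in main, after [NSApp run] returns almost immediately:
while (true) {
  @autoreleasepool {
    NSEvent* event;
    while ((event = [NSApp nextEventMatchingMask:NSEventMaskAny
                                       untilDate:nil
                                          inMode:NSDefaultRunLoopMode
                                         dequeue:YES]) != nil) {
      [NSApp sendEvent:event];
    }

    // receiveInputs(); updateGameState(); outputResults();
  }
}

Passing nil for untilDate: makes nextEventMatchingMask return immediately when no event is pending, which is exactly the polling behavior I want.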

Creating an iOS App without Xcode

Another thought occurs to me: if it was this easy to run a macOS program written in Objective-C++ from my CMake project, would it be just as simple to run an iOS app?

I did some Googling and quickly stumbled across the ios-cmake project on GitHub. The project has a lot of stars, and the license is compatible with my intentions, so let's give it a shot. I've downloaded the latest release (4.3.0) and plopped the unzipped folder into my third_party directory.

I've created a copy of the alfredo directory - let's call it scampi - and modified its CMake configuration a bit. I need to compile the app as a MACOSX_BUNDLE apparently, which means I also need to set MACOSX_BUNDLE_GUI_IDENTIFIER and some versioning info. Additionally, I need to link to UIKit instead of AppKit, which means my ApplicationDelegate class needs to inherit from UIApplicationDelegate instead of NSApplicationDelegate. UIApplicationDelegate has a completely different set of methods that I can implement, and I'll have to think through the mobile application lifecycle and how my game should behave with it at some point, but that's a problem for future me.
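Scampi's CMake configuration ended up looking roughly like this - the bundle identifier and version strings below are placeholders, not my real values:

add_executable(scampi MACOSX_BUNDLE src/main.mm)

set_target_properties(scampi PROPERTIES
    MACOSX_BUNDLE_GUI_IDENTIFIER "com.example.scampi"  # placeholder
    MACOSX_BUNDLE_BUNDLE_VERSION "1.0"                 # placeholder
    MACOSX_BUNDLE_SHORT_VERSION_STRING "1.0")          # placeholder

target_link_libraries(scampi PRIVATE linguine "-framework UIKit")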

The magic required to make this work in CLion is to make a separate CMake profile and point it to the newly downloaded toolchain. With that done, the barebones application seems to build just fine, and I can see the output scampi.app bundle in my CMake build folder. Cool, now how do I run this thing? Xcode comes with an iOS simulator, which can supposedly run apps from your computer's filesystem. Can we get access to the simulator outside of Xcode?

A quick Google search shows that I should be able to run the xcrun simctl command from my terminal - indeed, the output is a help dialog, showing a ton of different options with a helpful description: "Command line utility to control the Simulator". It looks like I can list out all the simulated devices with xcrun simctl list, and it shows me exactly that. Each device appears to be configured to mimic a real device, and contains some sort of identifier and its current running state. I'll try to work with the iPhone 13 Pro, since that's the phone I actually own. A quick xcrun simctl boot <device_id> and... nothing happened. The list command says it's running but I don't see anything.

I'll Google "xcrun simctl boot not working". Lo and behold, one of the first results is a Stack Overflow post entitled "iOS simulators boot, but don't appear, with "xcrun simctl boot ${UUID}"", in which a few answers all indicate that I just need to open the Simulator GUI app, perhaps using open -a Simulator. That did the trick.

Now how do I get my app on this thing? Maybe I'll try to drag the bundle onto the Simulator window... it worked! I didn't actually expect that to work. Opening my app results in a black screen, which isn't entirely unexpected, since I didn't actually add anything to it. Maybe I'll try to log something when the app opens. How do I do that? I know I can hook into the application:didFinishLaunchingWithOptions: method of my UIApplicationDelegate. Back in Apple's documentation, I had to select Objective-C as my language of choice rather than Swift, because apparently they do things differently. Evidently, I can construct my own logger with os_log_create, and then pass that logger into the os_log function. I ended up with something like:

// Inside application:didFinishLaunchingWithOptions: (requires #import <os/log.h>)
NSString* bundleIdentifier = [[NSBundle mainBundle] bundleIdentifier];
os_log_t customLog = os_log_create([bundleIdentifier UTF8String], "ScampiAppDelegate");
os_log_error(customLog, "Hello iOS!");

I'm logging an error so that it stands out more. I successfully built and installed the updated bundle, but how exactly do I view the logs? Google points me to an app called Console (on my computer, not within the simulator). When I open up Console, I see a lot of logs, but on the left side there's a list of devices, and sure enough, I can select the simulator from that list. I can add any arbitrary filter to this stream of logs, but for now I'll just filter by the process, which is called scampi.

"Hello iOS!"

With all of that working, I'm going to automate this process a little bit. I have two big reasons for doing so:

  1. It's pretty tedious to type out a bunch of specific commands and open a particular set of apps over and over.
  2. I don't want to forget how to do it! I honestly didn't intend to figure out how to integrate an iOS build into my codebase so early, and it will be a while before I need to focus on it. Automating it now will ensure that it is repeatable in the future.

I added a simulator.sh file to a new scripts/ directory within the scampi/ folder. This script just takes in one of the device IDs, starts up that particular simulator, and opens up the Console and Simulator apps. It waits for me to hit control+C and then shuts down the simulator. I've configured my IDE to run that script from a "simulator" build configuration, using the device ID for the iPhone 13 Pro simulator.
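For posterity, here's roughly what that script looks like - a sketch reconstructed from the description above, so the real simulator.sh may differ:

```shell
#!/bin/sh
# Sketch of scampi/scripts/simulator.sh - boots a simulator by device ID,
# opens the supporting apps, and shuts everything down on control+C.
set -e

DEVICE_ID="$1"  # one of the IDs reported by `xcrun simctl list`

xcrun simctl boot "$DEVICE_ID"
open -a Simulator
open -a Console

# Wait for control+C, then shut the simulator down.
trap 'xcrun simctl shutdown "$DEVICE_ID"; exit 0' INT
while true; do sleep 1; done
```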

I also realized that I can install the bundle onto the simulator through the xcrun simctl install command, and launch my app via the bundle ID using the xcrun simctl launch command. I whipped up a custom CMake target that builds the bundle, installs it to the simulator, and then launches my app. It also brings the Simulator app to the foreground so I can quickly switch to it and observe the early startup of my app. All I have to do to deploy the app to the simulator is select the new target in my IDE and press the "build" button!
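A custom target along these lines can accomplish that - this is just a sketch, with a hypothetical target name and bundle ID, not the exact configuration from my repository:

```cmake
# Hypothetical names; the real target and bundle identifier may differ.
add_custom_target(scampi-simulator
    # Install the freshly built bundle onto the currently booted simulator
    COMMAND xcrun simctl install booted "$<TARGET_BUNDLE_DIR:scampi>"
    # Launch the app by its bundle identifier
    COMMAND xcrun simctl launch booted "com.example.scampi"
    # Bring the Simulator app to the foreground
    COMMAND open -a Simulator
    DEPENDS scampi
    USES_TERMINAL)
```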

Truth be told, I'd prefer if the "build" and the "run" were separate actions for the same target, but that's actually kind of annoying to achieve with CMake configurations, so this will do for now. A note to myself for the future: I can just create a script that installs and launches the app, then create a target for that script, which has a dependency on the build. Maybe not as annoying as I thought. It would be great if I could get my app's logs to show up in the IDE instead of having to navigate the Console app, but that's a task for a different day.

2.5 Time Management

In Real Life

I've been reading through what I've written so far, and I realized that it reads as if I wrote all of it in one sitting. The reality is that it's been nearly a month since I first downloaded the metal-cpp library. It was a few days (maybe a week) from that point until I got my first macOS window opening. Those metrics are a bit misleading though. It's not that these tasks actually take that long; it's more that I only work on this project a few nights per week, and each night I do work on it, I only spend an hour or two. Furthermore, writing out my thoughts on every little thing is rather time-consuming for a few reasons:

  • The time it takes to type is minimal, but if I were just building the application without writing the book, I wouldn't be thinking about everything in full sentences. In fact, I often don't think in English at all when dealing with software. My thoughts are more organized into concepts that make connections to other concepts, much like software engineering in general. Therefore, I spend a non-negligible amount of time articulating my thoughts into a form that is understandable by a general audience.
  • Because I'm constantly switching between a programming mindset and a conversational mindset, I incur a context-switching penalty every time I do so. It's true that writing out my thoughts helps me stay more organized with the application that I'm developing, but that penalty is significant, and I pay it frequently.
  • I also find myself going into way more detail in my writing than I would ever need to go into, had I not been explaining the concept to an audience. Take the section on game loops, for example. Why did I spend any time whatsoever explaining what constitutes an "input" rather than an "output"? I already know that information, and I didn't have to go research that as part of this project. I simply felt compelled to provide that context for the benefit of the reader.

Some nights, I only write the book. Other nights, I only do research. All of this combined results in my actual coding time being squeezed down to maybe 30 minutes per week. That means I've only gotten about 2 hours of actual coding done in the last month, which sounds about right, given my current progress can be boiled down to "create an empty window on macOS, and a blank app on iOS".

Now for the kicker: it's been an entire year since I started this book, and I'm only just now getting a window to pop up on the screen. I go through phases of fixating on different projects, hobbies, and sometimes even work. To put it lightly, I struggle with attention issues. Within the last year, I learned how to build a simple calculator CPU using an FPGA, I developed a programming language that compiles into Lua, I experimented heavily with add-ons for World of Warcraft, and my current company released a new game (which always leads to a lot of overtime in this industry). I only recently got back into the flow of writing this book. I had some friends suggest that I try blogging instead, but such wide gaps between my posts would likely alienate any readers, so I'm glad I didn't go down that route. Even if this book takes many many years to finish, at least it will be one consistent experience for the reader.

In the Game

Like I mentioned in the section on game loops, the game's state is updated constantly in a loop. The loop iterates frequently enough that the rendered images appear to show objects moving on the screen - this is not a particularly "new" concept, as it's how movies have worked for over a century. What makes it cool is that the resulting image is based on an internal simulation of the game's world, and the player's actions can impact the state of that simulation (in contrast to a movie, which is the same every time you play it).

The state of the simulation can be updated as frequently as possible in order to maximize the use of the CPU - a small game could easily exceed a thousand updates per second! This doesn't necessarily do much good, since the display device has an upper limit on how often it can display those updates. Many (if not most) displays still have a refresh rate of 60Hz (1Hz == 1 refresh per second). Since I'm targeting iOS, I can just use generally available iOS devices as examples. The base iPhone 13 has a display that refreshes at a fixed 60Hz. However, the iPhone 13 Pro (the phone I currently use) has a "ProMotion" display, which has a dynamic refresh rate that can go up to 120Hz.

With the game simulation updating an order of magnitude faster than even the best displays can refresh, most of those updates are effectively wasted CPU time. Many updates can occur before the next image can be displayed to the player, but all of those updates could have been compacted into a single iteration of the loop, saving CPU time, and therefore power consumption (which is especially important on mobile devices). If only there were a way to tell our simulation how much time has passed since the last iteration of the loop! This concept is known as "delta time".

It actually gets a bit more complicated than this. Rendering APIs may block until the image has been shown on the screen, or may allow you to submit as many new images as you want, and simply take the latest on each refresh.

while (isRunning) {
  update();
  render(); // Could block until the next "refresh"
}

This behavior is usually configurable via the rendering API, and is what many games refer to as "v-sync": wait until the screen actually updates before continuing - that is to say, "sync" the application logic with the VBLANK (vertical blank) interval of the output device.

This term is entirely outdated. Old display devices used an electron beam to adjust the displayed colors. This beam started in one corner of the screen and drew line by line until it reached the opposite corner of the screen. The time it took the electron beam to reset to the starting corner (move vertically back to the other edge of the screen) was referred to as the VBLANK interval. If you could finish processing a single iteration of game simulation updates within that VBLANK interval, then you were in-sync with the vertical blank of the display.

Modern display devices don't use electron beams at all, and instead determine each pixel's state using a texture stored in the graphics device's memory. The data for the pixels is transmitted digitally rather than in-sequence over an analog connection. The display does this frequently based on its refresh rate, but the beginning of each "refresh" can be thought of as synonymous with the end of the VBLANK interval on older displays. As such, we refer to syncing with a modern display's refresh rate as "v-sync", even though there is no actual VBLANK interval anymore.

In any case, disabling v-sync could result in screen "tearing", which occurs when the application has updated the texture in graphics memory at the same time that the display is refreshing. This appears as if there are two different images, torn in half and taped together, being presented on the screen.

The disadvantage of enabling v-sync is that the inputs used to update your simulation could be old - with a display which refreshes at 60Hz, your inputs could have happened up to 1/60th of a second ago! Gasp! We'll go into other rendering techniques that can further increase this "input latency" later.

If you know that your application has v-sync enabled and you know that your target device will only ever refresh at 60Hz, then you can simply assume that your delta time is equal to 1/60. Enabling v-sync can also be used as an easy way to throttle your simulation updates, in order to reduce the amount of load on the CPU.

Since we've established that some iOS devices have a variable refresh rate (via ProMotion), we can't simply hardcode our delta time, even with v-sync enabled. We'll therefore need to calculate the delta time for each frame, and pass it into our update() function so that we can correctly animate our world over time.

auto currentTime = getTime(); // Assume this function exists

while (isRunning) {
  auto newTime = getTime();
  auto deltaTime = newTime - currentTime;
  currentTime = newTime;

  update(deltaTime);

  render();
}

One of the biggest issues with having a variable delta time is that it becomes difficult to reason about physics. Ideally, physical simulations would be integrated continuously, as they are in real life, but how do you write software that can perform continuous integration? That's actually a really cool idea and I might do some reading on that topic later.

Instead, we integrate our physics simulation one step at a time. Having a fixed time step based on the VBLANK interval was actually a blessing in this area, since the physical simulations would at least be consistent frame-to-frame, and it would be easy to calculate the limitations of your physical simulation based on your known delta time.

Consider two circles in our 2D world which should not be able to pass through one another, each with a 1/2-unit radius. If we move one circle toward the other at a velocity of 1-unit per second, and our delta time is 2 seconds, then the moving circle will have moved 2 entire units within one frame, and could very well have magically jumped over the static circle.

If we reduce our delta time to 1 second, then the chance of our moving circle magically passing through the static circle is drastically reduced, although it could still happen. If our delta time were reduced down to just 0.99 seconds, then our circles would correctly collide every time. However, reducing the size of one of the circles slightly, or increasing the velocity of the moving circle, would again increase our chances of a false-negative collision.

In order to support physical simulations containing geometry of varying sizes and velocities, certainly we would prefer a very small time step, even if that means the time step is different from frame to frame, right? Well, if the time step is variable, then it could also be arbitrarily large, and in that case, it becomes hard to define any rules for how small our objects can be, or how fast they can go. Even if the time steps weren't particularly large, the player may experience inconsistent behavior and unexpected consequences to their actions. It is therefore best to handle physical simulations using a fixed time step.

Multiplayer games in which the physics is simulated on different devices must use a fixed time step in order to remain as deterministic as possible. The use of a central server can true up any inconsistencies, but with variable time steps between the clients, those inconsistencies can be drastically different. In the case of a deterministic simulation, the only inconsistencies that should happen would be the result of floating point precision issues or network communication failures.

I guess I don't have to convince you that supporting a fixed time step for the physical simulation is a good idea. This is my game after all, so I'll just show you how to support it along with a variable delta time for rendering. If you're super interested in this stuff, check out "Fix Your Timestep!" by Glenn Fiedler - it is very informative and quite popular in the game development community.

First, we have to define our constant time step for our physical simulation. Unity defaults to 1/50th of a second (0.02s). Other engines I've used default to 1/60th of a second, which is harder to represent as a decimal, but comes from dealing with 60Hz refresh rates for so long. Networked games might use a larger time step in order to reduce the amount of state updates that need to be propagated across the wire, and then use interpolation to fill in the gaps between each step. For now, I'll just use Unity's default of 1/50th of a second.

We will accumulate the delta time across multiple iterations of our game loop, and as our accumulator exceeds our target fixed time step interval, we will tick our physics simulation, and deduct the fixed interval from the accumulator.

const auto fixedDeltaTime = 0.02;

auto currentTime = getTime();
auto accumulator = 0.0;

while (isRunning) {
  auto newTime = getTime();
  auto deltaTime = newTime - currentTime;
  currentTime = newTime;
  accumulator += deltaTime;

  while (accumulator >= fixedDeltaTime) {
    fixedUpdate(fixedDeltaTime);
    accumulator -= fixedDeltaTime;
  }

  update(deltaTime);

  render();
}

Using this pattern, our fixedUpdate() can be used for all objects that interact with the physical simulation, and update() can be used for everything else - visual updates, input handling, audio cues, etc.

There is still a problem with this solution though: what happens to the remainder of the accumulator during each render()? If we leave that remainder sitting around until it eventually exceeds the fixed time step interval, then our physical objects will appear to hop across the screen!

Glenn Fiedler's solution is to actually run the physics simulation one "tick" behind, and use the accumulator's current value (as a percentage of the constant fixed time step interval) to interpolate between the last two known states of the simulation. Doing this allows all of our physics objects to move smoothly throughout our game world as if they had been using the variable time step, but results in input latency equal to our fixed time step interval. We will work on this interpolation strategy (as well as ways to mitigate the input latency) on a per-object basis later on.

I created a couple of delta time accumulators and frame counters, and added some code to update() and fixedUpdate() to spit out the number of frames my application has processed in the last second:

float _dtAccumulator = 0.0f;
int _updateCounter = 0;

void update(float deltaTime) {
  _dtAccumulator += deltaTime;
  _updateCounter++;

  while (_dtAccumulator >= 1.0f) {
    std::cout << "update(): " << _updateCounter << " fps" << std::endl;

    _dtAccumulator -= 1.0f;
    _updateCounter = 0;
  }
}

float _fdtAccumulator = 0.0f;
int _fixedUpdateCounter = 0;

void fixedUpdate(float fixedDeltaTime) {
  _fdtAccumulator += fixedDeltaTime;
  _fixedUpdateCounter++;

  while (_fdtAccumulator >= 1.0f) {
    std::cout << "fixedUpdate(): " << _fixedUpdateCounter << " fps" << std::endl;

    _fdtAccumulator -= 1.0f;
    _fixedUpdateCounter = 0;
  }
}

Here is a sample of what I'm seeing (I've added gaps in the lines to convey that the logs are being shown once per second):

update(): 370637 fps
fixedUpdate(): 50 fps

update(): 372748 fps
fixedUpdate(): 50 fps

update(): 372175 fps
fixedUpdate(): 50 fps

update(): 372103 fps
fixedUpdate(): 50 fps

Over 370,000 frames per second might seem like a lot, but this application isn't really doing anything yet. Once we start submitting draw calls to the GPU, this number will go down substantially. If we use v-sync, then this number won't go any higher than 120.

More importantly, our fixedUpdate() is being processed at a consistent 50 frames per second, even though the main update() is getting called substantially more frequently.

What happens if we artificially throttle our update() to process at under 50 frames per second? I'll add std::this_thread::sleep_for(std::chrono::milliseconds(50)); to the end of my update() function so that it only processes about 20 frames per second.

update(): 19 fps
fixedUpdate(): 50 fps

fixedUpdate(): 50 fps
update(): 20 fps

update(): 19 fps
fixedUpdate(): 50 fps

fixedUpdate(): 50 fps
update(): 20 fps

Perfect! The first iteration of our game loop takes about 50 milliseconds, so when we start the second iteration, our fixedUpdate() function (which we've designed to run every 20 milliseconds) is called twice, and we keep 10 milliseconds in the _fdtAccumulator. During the third iteration, the accumulator is up to 60 milliseconds (the remaining 10 + the 50 from the update), so fixedUpdate() is called three times, with a remainder of 0. This pattern continues indefinitely such that the update() function alternates logging its framerate before fixedUpdate() (on the 19th frame), and after fixedUpdate() (on the 20th frame). In any case, fixedUpdate() is consistently being called 50 times per second.

One last test: what happens if update() is called less than once per second? I'll increase the sleep to 1250 milliseconds.

fixedUpdate(): 50 fps
update(): 1 fps

fixedUpdate(): 50 fps
update(): 1 fps

fixedUpdate(): 50 fps
fixedUpdate(): 50 fps
update(): 1 fps
update(): 0 fps

fixedUpdate(): 50 fps
update(): 1 fps

Now this just feels like magic. The accumulator is doing its job beautifully, making sure to tick fixedUpdate() however many times it takes to maintain that sweet 50 frames per second, even when the frame rate for update() drops below 1 per second. The crazy part is that, due to the sleep, fixedUpdate() might not actually get invoked for longer than 1 second. How then is it reporting a consistent 50 frames per second? Once our game loop finally comes around to comparing the accumulator to the fixed delta time constant, fixedUpdate() gets invoked many times based on how long it has been since the last invocation - perhaps more than 50 times back-to-back! This doesn't actually matter, because from the user's perspective, it still got called the correct number of times by the time they were able to observe the result. You can imagine that we displayed the world on the screen right after we called update(), once every 1250 milliseconds. Because fixedUpdate() is still being called the correct number of times, it doesn't actually matter how choppy the animation is. The objects in the game world will continue to move around at the expected rates, based on the amount of time passing by, regardless of the slow frame rate. Even more importantly, the objects will still interact with one another correctly, even though we didn't actually render the frame in which a collision first occurred! Fascinating, really.

2.6 Cleanup Time

With my window opening, and my game loop ticking properly, I think now is a good time to take a beat and combine the two using abstractions that can be implemented on a per-platform basis.

Within Linguine, I'll add a handful of headers, which should be somewhat self-explanatory:

linguine/
  include/
    Engine.h
    InputManager.h
    LifecycleManager.h
    TimeManager.h

I'll also implement the Engine class within Linguine using the interfaces defined in the other headers. This class contains a run() method that handles our game loop, as well as the update() and fixedUpdate() methods I showed you earlier. Each platform (Alfredo and Scampi, for now) can provide their own implementations for the interfaces, and simply construct an instance of Engine with instances of those implementations in their main() method (or whatever the platform equivalent is).

The InputManager interface simply contains one method for the time-being: pollEvents(). This method is responsible for receiving all of the most recent events from the operating system. At this point, we just forward all of the events to the NSApplication, which allows proper handling of the window (such as the close button) and the overall application process (such as the menu bar). Later on, we'll need to detect events that our game cares about, convert them to a form that the game can understand, and allow the game logic to query the state of those events.

The LifecycleManager interface currently contains an isRunning() method, so that our game knows when to stop iterating its game loop. I'm not actually sure what sort of nuances we might encounter with the mobile lifecycle - for example, what should our game do when we receive a phone call? - but at least we have a place to put that sort of logic.

The TimeManager interface can query the current time from the operating system, and determine how much time has passed between two timestamps. I'm making a platform-specific abstraction here because different operating systems use different APIs for determining the time. Again, we can extend this later as needed.

I'd kind of like to show the code I've written so far, but "proper" C++ has two files for every class, and that makes it really annoying to show in a book. I'll just show you the implementation of Engine::run():

linguine/src/Engine.cpp

void linguine::Engine::run() {
  const auto fixedDeltaTime = 0.02f;

  auto currentTime = _timeManager->currentTime();
  auto accumulator = 0.0f;

  while (_lifecycleManager->isRunning()) {
    _inputManager->pollEvents();

    auto newTime = _timeManager->currentTime();
    auto deltaTime = _timeManager->durationInSeconds(currentTime, newTime);
    currentTime = newTime;
    accumulator += deltaTime;

    while (accumulator >= fixedDeltaTime) {
      fixedUpdate(fixedDeltaTime);
      accumulator -= fixedDeltaTime;
    }

    update(deltaTime);

    // TODO render();
  }
}

If you're interested in the rest of the project code at this point, check out the state of the GitHub repository at this commit. I literally just created the repository to link to it in this chapter. I spent some time trying to decide between the Apache License 2.0 and the MIT License. Generally I use the MIT License for my personal projects, but I'll go with the Apache License 2.0 this time around, since it provides more explicit patent rights, while still being very permissive for others to learn from and use as they see fit. In any case, you might notice that the commit I've listed here is actually the first commit in that repository - I really should have created the repo earlier, but the project structure still felt very malleable. Now that I've learned more about how to interact with the platform APIs using Objective-C++, I'm much more confident about the structure of the project.