Audio

Sound effects and music are often ignored by indie developers until very late in the process. It is, after all, completely possible to prototype a game without having any sound at all. In the interest of completing the "engine" phase of this project, I'm going to go ahead and add audio capabilities, even though I likely won't use them at all for early prototyping.

Adding the ability to play audio isn't particularly difficult, but to be honest, I've never actually implemented the functionality into any of my game engines before. That's not to say that I haven't worked on projects that heavily utilized audio, just that they weren't game engines.

Even though it's often an afterthought, audio adds an incredible level of immersion to any game. Even to this day, I commonly listen to the soundtracks of old video games as background music while I work. At this very moment, I'm listening to a piano rendition of music from the Metroid games by an artist named Chewie. My wife is far from a "gamer", but even she has recently been enjoying listening to music from The Legend of Zelda series (and I couldn't be more proud). Playing the ocarina in Ocarina of Time was a completely satisfying game mechanic on its own, often worth ignoring the quest to defeat Ganon, if only just for a moment.

Unfortunately, it's much easier to ignore audio in mobile games, unless the game is specifically about the sound, such as the Magic Tiles series. The reality is that the casual nature of mobile games means that people will often play those games in areas where it is socially unacceptable to have noises constantly emanating from their phones. Even ringtones have fallen out of favor, with many people simply leaving their phones silenced. When the average person downloads a new game for their phone, the first thing they do when they open it is find the "mute" option. If that option is not readily available, they just turn their phone's volume all the way down. And yet, it is expected for a complete game to include a soundtrack and convincing sound effects. Even someone who routinely mutes games would still find it jarring to encounter a game which didn't have any audio at all.

This chapter is not about writing music or engineering cool sound effects, any more than the rendering chapter was about creating visual art. We just need our engine to be able to play specific sounds on demand. Since I've never done this within the context of a game engine before, I'm not even sure what specific goals I would like to set for myself. Let's just get started and see what kind of problems we run into.

7.1 Platform Support

The first thing we have to do is decide which audio API to use.

OpenAL is a common choice for video games, due to its support for spatial audio sources in a 3D world relative to a central "listener". The API itself is technically an open standard and free to use, but each platform has separate implementations that may or may not be free. Apple provides a free OpenAL implementation on macOS and iOS backed by their own Core Audio framework, but it appears that it was deprecated in macOS 10.15 "Catalina", with no information about when it will be completely unsupported. This isn't a big surprise to me, since they did the same for the OpenGL rendering API (in favor of their own Metal API), but it does make it a non-starter for us.

Since deprecating OpenAL, Apple has stated that they prefer developers to use their Audio Engine API, which is also backed by Core Audio. Unlike Metal, however, they haven't provided any C++ bindings for Audio Engine, which means we'll have to use "Objective-C++".

There doesn't seem to be any other easily discoverable alternative, so this decision has effectively been made for me.

The AudioManager

Just like any of our platform-specific abstractions, I'll start by creating an interface within Linguine. I have no idea what methods I'll need, but we'll go ahead and add the base class so we can inherit from it elsewhere.

linguine/include/AudioManager.h

#pragma once

namespace linguine {

class AudioManager {
  public:
    virtual ~AudioManager() = default;
};

}  // namespace linguine

In order for Alfredo and Scampi to share AudioManager implementations, I'll need to create a new CMake sub-project, in the same way that I did with renderers/metal/. This is also how we'll add different audio backends for other platforms, if we ever need to. I'll be creating a new AudioEngineAudioManager within the audio/audioengine/ sub-project. The CMake configuration for the new project is pretty straightforward, though it took me a while to figure out which library I needed to link to.

audio/audioengine/CMakeLists.txt

enable_language(OBJCXX)

add_library(audioengine
        src/AudioEngineAudioManager.mm
)

target_include_directories(audioengine PUBLIC include)
target_include_directories(audioengine PRIVATE src)

target_link_libraries(audioengine PRIVATE linguine)

find_library(AVFOUNDATION_LIBRARY AVFoundation REQUIRED)
target_link_libraries(audioengine PRIVATE ${AVFOUNDATION_LIBRARY})

I updated the Engine to require an AudioManager shared pointer. After creating the AudioEngineAudioManager class with an empty constructor, I was able to link to the new audioengine library and construct an instance of the new class from both Alfredo and Scampi to provide to the Engine. Both apps can compile and run just fine without complaining about any linking errors.
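
The Engine's full parameter list isn't relevant to this chapter, but the change amounts to something like this sketch (the other platform abstractions it presumably already receives are elided):

#include <memory>
#include <utility>

#include <AudioManager.h>

namespace linguine {

class Engine {
  public:
    explicit Engine(/* ..., */ std::shared_ptr<AudioManager> audioManager)
        : _audioManager(std::move(audioManager)) {}

  private:
    std::shared_ptr<AudioManager> _audioManager;
};

}  // namespace linguine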

Starting with the documentation for the AVAudioPlayer, I followed the link entitled "Configuring the Audio Playback of iOS and tvOS Apps" and followed the instructions to create an AVAudioSession within my new audio manager implementation. Upon building Alfredo, I was greeted with this compiler error:

error: 'AVAudioSession' is unavailable: not available on macOS

Apparently I didn't read the name of the guide closely enough: an AVAudioSession isn't required on macOS at all! I'll need to create the session in Scampi, but Alfredo won't require it. I added the code to activate the session to the ScampiViewController, before constructing the audio manager, though it's unclear to me when I'm supposed to deactivate it.

auto audioSession = [AVAudioSession sharedInstance];

NSError* error;
if (![audioSession setCategory:AVAudioSessionCategoryPlayback
                         error:&error]) {
  NSLog(@"%@", [error localizedDescription]);
  return;
}

if (![audioSession setActive:true
                 withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
                       error:&error]) {
  NSLog(@"%@", [error localizedDescription]);
  return;
}
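
For completeness, deactivation is just the mirror image of that activation call - though, as I said, I'm still not sure where it actually belongs, so this is a sketch that isn't wired up anywhere yet:

if (![audioSession setActive:false
                 withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
                       error:&error]) {
  NSLog(@"%@", [error localizedDescription]);
}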

Man I really hate Objective-C syntax, but here we are.

Keeping the ball rolling, I found the documentation for the AVAudioEngine, and followed the guide. The examples are written in Swift, even though I have the documentation page set to prefer Objective-C. Oh well, it's easy enough to translate it myself. The guide also glosses over the creation of an AVAudioFile. I need to know which format to use in order to connect the AVAudioPlayerNode to the AVAudioEngine's output node, but I won't know that until I determine what types of files I'm loading... or whether I can get away with delaying that decision for now.

Audio Formats

Old game consoles made heavy use of dedicated digital signal processing (DSP) hardware in order to generate audio at runtime, based on instructions provided by the game. These synthesized sounds are highly recognizable due to the high regularity in the sound waves generated by the hardware. Sine waves, "square" waves, and "sawtooth" waves commonly make up the melodies of the tunes generated by these chips, with percussive sounds being made up of quick bursts of "white noise". Some people love this synthesized sound so much that an entire genre has developed around recreating it - the songs are appropriately described as "chip tunes".

It's entirely possible to generate these types of waves at runtime using the AVAudioEngine, and Apple has published an example project which does just that. If I were to build my engine to utilize such a technique, I would have to painstakingly craft programs which created the appropriate signal at the right time in order to create music. This is exactly what game developers had to do in the days of dedicated DSP hardware, and it's quite tedious.
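
Just to make "crafting the signal" concrete, here's a rough sketch (plain C++, not hooked up to AVAudioEngine or anything else in the engine) of generating one second of those classic waveforms as raw samples:

#include <cmath>
#include <vector>

// Generates one second of a 440 Hz tone at 44.1 kHz for each classic waveform.
void generateWaveforms(std::vector<float>& sine,
                       std::vector<float>& square,
                       std::vector<float>& sawtooth) {
  constexpr auto sampleRate = 44100;
  constexpr auto frequency = 440.0f;
  constexpr auto pi = 3.14159265f;

  sine.resize(sampleRate);
  square.resize(sampleRate);
  sawtooth.resize(sampleRate);

  for (auto i = 0; i < sampleRate; ++i) {
    // Phase of the current sample, measured in cycles (0.0 to 1.0).
    auto phase = std::fmod(static_cast<float>(i) * frequency / sampleRate, 1.0f);

    sine[i] = std::sin(phase * 2.0f * pi);
    square[i] = phase < 0.5f ? 1.0f : -1.0f;
    sawtooth[i] = 2.0f * phase - 1.0f;
  }
}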

Modern hardware is capable of outputting arbitrary audio sourced from files, or even network streams. Even if I wanted to create a soundtrack exclusively consisting of retro-sounding "chip tunes", it would be much easier to create those songs in (relatively) modern applications like Avid Pro Tools or FL Studio, export them to files, and play them in my game by loading the files.

The same logic goes for pixel art - I have already demonstrated that my renderer is not limited in the types of shapes it's capable of drawing, but if I wanted to utilize a pixel art style, then I would create that art outside of the engine, and load it into the engine from files. Unlike the renderer, however, I don't need to write any audio-specific sub-programs that get submitted to dedicated hardware at runtime. All I need to do is load the file and tell the audio API to start playing it.

A thought occurs: I haven't bothered to load any files in my engine yet - everything has been entirely hard-coded at some layer. I'll just see how far I can get without having to write a file-loading abstraction.

Playing a Clip

I've been slowly whipping up the code to get audio clips to play over the last several days. A lot of it has been learning the order in which the API expects things to happen - it is heavily stateful, after all. For the most part, it's fairly obvious when you've done something wrong (the app crashes), and the error messages are clear. Here are some things I've learned along the way (a condensed sketch of the resulting call order follows the list):

  • The AVAudioEngine lazily creates its internal resources, and you have to access either the mainMixerNode or the outputNode of the object in order for those resources to be initialized.
  • Attempting to "start" the audio engine before those resources are created will result in an error.
  • You have to manually create your own nodes (AVAudioPlayerNode in my case), and "attach" them to the audio engine before they are part of the internal graph.
  • Furthermore, you have to manually "connect" your nodes to some other node within the graph using the audio engine's API.
  • You cannot call "play" on a node until it has been attached to an audio engine and that audio engine has been started.
  • "Playing" a node doesn't necessarily start any audio - you must also "schedule" some audio to be played for that node. You can "pause" a node while it's actively playing audio, and resume it later by calling "play" again. You can also "stop" the node, which clears its scheduled audio entirely.
  • You can set your own completionHandler for scheduling operations, and can respond to "consumed", "rendered", or "played back" events.
  • The completionHandler does not necessarily run on the main thread, so mutating any shared state from it without synchronization will inevitably cause problems.
  • You must "stop" all of the nodes when you are done with them. You must also "stop" the audio engine when you're done with it.
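
Condensed into a sketch, the order that ended up working for me looks roughly like this (error handling omitted, and file stands in for an already-loaded AVAudioFile - the real implementation follows shortly):

auto engine = [[AVAudioEngine alloc] init];
auto playerNode = [[AVAudioPlayerNode alloc] init];

// Touching mainMixerNode (or outputNode) forces the engine to create its internal resources.
auto format = [engine.mainMixerNode outputFormatForBus:0];

[engine attachNode:playerNode];
[engine connect:playerNode to:engine.mainMixerNode format:format];
[engine startAndReturnError:nil];

[playerNode play];
[playerNode scheduleFile:file atTime:nil completionHandler:nil];

// ...and when we're done with everything:
[playerNode stop];
[engine stop];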

The documentation states that you may create new nodes and attach them to the graph after the audio engine has already been started. Attempting to "play" those new nodes, however, results in somewhat random memory corruption. I tried to enable address sanitization for Alfredo (by adding set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address -g -O1") to its CMake configuration) in order to pinpoint the source of the corruption, but the errors do not occur when I do that, which likely means there's some sort of race condition, and enabling the address sanitizer slows things down enough to avoid it.

Lucky for us, I won't be adding new nodes to an already-started audio engine - I was only doing so as a lazy way to get started. Instead, I'll create a finite number of nodes prior to starting the audio engine. If I run out of "available" nodes at runtime, then a requested sound just won't be able to play.

Throughout my debugging journey, I discovered that Objective-C's automatic reference counting (ARC) was not actually enabled in Alfredo, because it does not get compiled using the Xcode generator like Scampi does. This means I was leaking memory all over the place in my Objective-C code because I was never manually calling release on my objects! To remedy this, I've added the following CMake configuration to Alfredo:

alfredo/CMakeLists.txt

set(CMAKE_OBJC_FLAGS "-fobjc-arc")
set(CMAKE_OBJCXX_FLAGS "-fobjc-arc")

Yes, it really is that easy. I can't believe I didn't realize that before.

Obviously my game doesn't have any audio assets yet, but in order to test the functionality, I've downloaded a file named Balloon Pop 1.wav from a project named "8-bit / 16-bit Sound Effects (x25) Pack" on itch.io by the user JDWasabi. It's the perfect clip to play for object destruction. I've added the file to a new assets/audio/ directory in my repository.

There are a handful of ways to copy these files into the build output directory, but I'm going to use CMake's configure_file.

alfredo/CMakeLists.txt

file(GLOB RESOURCE_FILES
        ${CMAKE_SOURCE_DIR}/assets/**/*)

foreach(RESOURCE_FILE IN LISTS RESOURCE_FILES)
    configure_file(${RESOURCE_FILE} ${CMAKE_CURRENT_BINARY_DIR} COPYONLY)
endforeach()

Scampi's configuration is similar, but ios-cmake already provides a way to copy "resource" files into an application bundle, in the same way we copy our storyboard files.

scampi/CMakeLists.txt

file(GLOB RESOURCE_FILES
        ${CMAKE_SOURCE_DIR}/assets/**/*)

set(RESOURCES
        resources/LaunchScreen.storyboard
        resources/Main.storyboard
        ${RESOURCE_FILES}
)
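
Presumably the RESOURCES list then gets attached to the target the same way the storyboards already are. I'm not reproducing Scampi's whole target setup here, but the shape of it is something like this (SOURCE_FILES is a stand-in for the app's actual source list):

add_executable(scampi MACOSX_BUNDLE
        ${SOURCE_FILES}
        ${RESOURCES}
)

set_target_properties(scampi PROPERTIES RESOURCE "${RESOURCES}")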

Unfortunately this is not sufficient to consolidate the file loading differences between Alfredo and Scampi, but I'll tackle that in a little while. Here's what I've whipped up so far.

audio/audioengine/include/AudioEngineAudioManager.h

#pragma once

#include <AudioManager.h>

#include <mutex>
#include <queue>

#import <AVFoundation/AVAudioEngine.h>
#import <AVFoundation/AVAudioFile.h>
#import <AVFoundation/AVAudioPlayerNode.h>

namespace linguine::audio {

class AudioEngineAudioManager : public AudioManager {
  public:
    AudioEngineAudioManager();

    ~AudioEngineAudioManager() override;

    void play() override;

  private:
    static constexpr uint8_t _maxChannels = 32;

    AVAudioEngine* _audioEngine;
    AVAudioFile* _fileToPlay;

    NSMutableArray<AVAudioPlayerNode*>* _playerNodes;

    std::queue<AVAudioPlayerNode*> _nodePool;
    std::mutex _poolMutex;

    AVAudioPlayerNode* getPlayerNode();
};

}  // namespace linguine::audio

audio/audioengine/src/AudioEngineAudioManager.mm

#include "AudioEngineAudioManager.h"

#import <AVFoundation/AVAudioMixerNode.h>

namespace linguine::audio {

AudioEngineAudioManager::AudioEngineAudioManager()
    : _audioEngine([[AVAudioEngine alloc] init]),
      _playerNodes([[NSMutableArray alloc] init]) {
  auto outputFormat = [_audioEngine.outputNode inputFormatForBus:0];
  auto inputFormat = [[AVAudioFormat alloc] initWithCommonFormat:outputFormat.commonFormat
                                                  sampleRate:outputFormat.sampleRate
                                                    channels:1
                                                 interleaved:outputFormat.isInterleaved];

  [_audioEngine connect:_audioEngine.mainMixerNode
                     to:_audioEngine.outputNode
                 format:outputFormat];

  for (auto i = 0; i < _maxChannels; ++i) {
    auto playerNode = [[AVAudioPlayerNode alloc] init];
    [_playerNodes addObject:playerNode];

    [_audioEngine attachNode:playerNode];
    [_audioEngine connect:playerNode
                       to:_audioEngine.mainMixerNode
                   format:inputFormat];

    _nodePool.push(playerNode);
  }

  NSError* error;

  if (![_audioEngine startAndReturnError:&error]) {
    NSLog(@"%@", error.localizedDescription);
    return;
  }

  for (AVAudioPlayerNode* playerNode in _playerNodes) {
    [playerNode play];
  }

  auto url = [NSURL fileURLWithPath:@"Balloon Pop 1.wav"];
  //auto url = [[NSBundle mainBundle] URLForResource:@"Balloon Pop 1" withExtension:@"wav"];
  _fileToPlay = [[AVAudioFile alloc] initForReading:url
                                              error:&error];

  if (error) {
    NSLog(@"%@", [error localizedDescription]);
    return;
  }
}

AudioEngineAudioManager::~AudioEngineAudioManager() {
  for (AVAudioPlayerNode* playerNode in _playerNodes) {
    [playerNode stop];
  }

  [_audioEngine stop];
}

void AudioEngineAudioManager::play() {
  auto playerNode = getPlayerNode();

  if (playerNode) {
    [playerNode scheduleFile:_fileToPlay
                        atTime:nil
        completionCallbackType:AVAudioPlayerNodeCompletionDataPlayedBack
             completionHandler:^(AVAudioPlayerNodeCompletionCallbackType callbackType) {
               std::unique_lock<std::mutex> lock(_poolMutex);
               _nodePool.push(playerNode);
             }];
  }
}

AVAudioPlayerNode* AudioEngineAudioManager::getPlayerNode() {
  // The completionHandler may push nodes back from another thread, so take the
  // lock before even checking whether the pool is empty.
  std::unique_lock<std::mutex> lock(_poolMutex);

  if (_nodePool.empty()) {
    return nullptr;
  }

  auto result = _nodePool.front();
  _nodePool.pop();

  return result;
}

}  // namespace linguine::audio

The constructor sets up 32 channels and adds them to a queue of available nodes. The rest of the constructor is just boilerplate for setting up and starting the audio engine with multiple input nodes, utilizing a mixer. The only thing I'd like to highlight is the commented out url, which uses [NSBundle mainBundle] - this is how iOS will load the file, but I'll have to extract that detail later.

The getPlayerNode() method grabs a node from the pool, if one is available. Even though the rest of the engine only calls it from a single thread, the completionHandler within the play() method can push a node back onto the pool from a different thread at any moment, so the whole method takes the lock - including the empty() check. Both of these functions rely on the lock being released when it goes out of scope, which is the idiomatic RAII pattern in the C++ world, and means you can never forget to release the mutex.

Locking in this way is not particularly performant, but once again, if it proves to be a problem, then we'll revisit it later and optimize.

Finally, we can play the audio clip from our SelectionDestructionSystem by injecting the AudioManager.

linguine/src/systems/SelectionDestructionSystem.cpp

void SelectionDestructionSystem::update(float deltaTime) {
  findEntities<LongPressed>()->each([this](Entity& entity) {
    _audioManager.play();
    entity.destroy();
  });
}
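
The injection itself isn't shown here, but the system just holds onto a reference it receives at construction time - something like this sketch (the System base class and EntityManager parameter are assumptions about the surrounding ECS, based on how the other systems are described):

class SelectionDestructionSystem : public System {
  public:
    SelectionDestructionSystem(EntityManager& entityManager,
                               AudioManager& audioManager)
        : System(entityManager), _audioManager(audioManager) {}

    void update(float deltaTime) override;

  private:
    AudioManager& _audioManager;
};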

I realize that I cannot illustrate the functionality to you, but you can imagine the sound of a balloon popping every time an object is destroyed by long-pressing it - and it works!

7.2 File Loading Abstraction

I've cut a lot of corners with regard to file handling in this engine. The renderer doesn't store its meshes in model files, nor do the shaders live in dedicated files. When a game system wants to render a quad, it doesn't ask the renderer to create the quad mesh, then turn around and ask it to draw the mesh that was just created. Instead, the quad mesh is one of a few shapes supported by the renderer, and the game system just needs to set the MeshType to Quad in the appropriate render feature.

This style is not common for general purpose game engines, and certainly not suitable for an engine that needs to mix and match meshes, textures, and render pipelines in arbitrary ways - but it works perfectly fine for our narrowly focused engine. Rather than passing arbitrary file paths or file handles into these disparate sub-systems, hoping the path is valid, and handling any error that occurs, all we have to do to support something new is add an enum value and implement the backend support for it.

I intend to use the same focused enum-based approach with our AudioManager abstraction. This way, I don't have to actually worry about creating a FileManager that gets used throughout the engine, nor do I need to support the Renderer or AudioManager receiving references to the resulting file handles.

Rather than creating a global file manager, I will simply create an abstraction specific to this audio manager, which I will call the AudioEngineFileLoader. Just like any of the other platform-level abstractions, it will be the job of Alfredo and Scampi to construct a suitable file loader implementation and pass it to the AudioEngineAudioManager during setup.

The AudioEngineFileLoader base class is responsible for creating an AVAudioFile from an NSURL, but it will be the job of each platform implementation to define how the NSURL gets created for each clip.

Audio clips are an interesting topic because they can represent different types of audio. The "pop" clip that we've used so far is a "sound effect" clip, but it is not unreasonable to assume that we will need "music" clips in the future. For now, I'll ignore the music use case, but I will create an EffectType enum within Linguine. To better organize the audio-related functionality within the engine, I'll create a new audio/ sub-directory, and move the AudioManager into it, along with the new enum.

linguine/include/audio/EffectType.h

#pragma once

namespace linguine {

enum EffectType {
  Pop
};

}  // namespace linguine

I've also updated the AudioManager to require an EffectType parameter for its play() method.

linguine/include/audio/AudioManager.h

#pragma once

#include "EffectType.h"

namespace linguine {

class AudioManager {
  public:
    virtual ~AudioManager() = default;

    virtual void play(EffectType effectType) = 0;
};

}  // namespace linguine

In order to preserve the functionality of the SelectionDestructionSystem, I'll need to pass the correct EffectType to the play() method.

linguine/src/systems/SelectionDestructionSystem.cpp

void SelectionDestructionSystem::update(float deltaTime) {
  findEntities<LongPressed>()->each([this](Entity& entity) {
    _audioManager.play(EffectType::Pop);
    entity.destroy();
  });
}

Obviously, I'll have to update the AudioEngineAudioManager to receive the EffectType parameter in its implementation of the play() method. As it is implemented right now, the AudioEngineAudioManager initializes its own _fileToPlay member variable, which gets played each time the play() method is called. I'll go ahead and remove that variable entirely, instead retrieving the desired file from the file loader.

audio/audioengine/include/AudioEngineFileLoader.h

#pragma once

#include <unordered_map>

#import <AVFoundation/AVAudioFile.h>

#include <audio/EffectType.h>

namespace linguine::audio {

class AudioEngineFileLoader {
public:
  virtual ~AudioEngineFileLoader() = default;

  AVAudioFile* getAudioFileForEffect(EffectType effectType);

  virtual NSURL* getUrlForEffect(EffectType effectType) = 0;

private:
  std::unordered_map<EffectType, AVAudioFile*> _loadedFiles;
};

}  // namespace linguine::audio
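
The non-virtual half of that class isn't listed in this chapter, but getAudioFileForEffect presumably just loads each file lazily and caches it by its EffectType - a sketch of how it might look, within the linguine::audio namespace:

AVAudioFile* AudioEngineFileLoader::getAudioFileForEffect(EffectType effectType) {
  // Return the cached file if this effect has already been loaded.
  auto existing = _loadedFiles.find(effectType);
  if (existing != _loadedFiles.end()) {
    return existing->second;
  }

  auto url = getUrlForEffect(effectType);
  if (!url) {
    return nil;
  }

  NSError* error;
  auto file = [[AVAudioFile alloc] initForReading:url
                                            error:&error];

  if (error) {
    NSLog(@"%@", [error localizedDescription]);
    return nil;
  }

  _loadedFiles[effectType] = file;
  return file;
}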

audio/audioengine/include/AudioEngineAudioManager.h

#pragma once

#include "audio/AudioManager.h"

#include <mutex>
#include <queue>

#import <AVFoundation/AVAudioEngine.h>
#import <AVFoundation/AVAudioFile.h>
#import <AVFoundation/AVAudioPlayerNode.h>

#include "AudioEngineFileLoader.h"

namespace linguine::audio {

class AudioEngineAudioManager : public AudioManager {
  public:
    explicit AudioEngineAudioManager(std::unique_ptr<AudioEngineFileLoader> fileLoader);

    ~AudioEngineAudioManager() override;

    void play(EffectType effectType) override;

  private:
    static constexpr uint8_t _maxChannels = 32;

    std::unique_ptr<AudioEngineFileLoader> _fileLoader;
    AVAudioEngine* _audioEngine;
    NSMutableArray<AVAudioPlayerNode*>* _playerNodes;

    std::queue<AVAudioPlayerNode*> _nodePool;
    std::mutex _poolMutex;

    AVAudioPlayerNode* getPlayerNode();
};

}  // namespace linguine::audio

audio/audioengine/src/AudioEngineAudioManager.mm

void AudioEngineAudioManager::play(EffectType effectType) {
  auto playerNode = getPlayerNode();

  if (playerNode) {
    auto file = _fileLoader->getAudioFileForEffect(effectType);

    [playerNode scheduleFile:file
                        atTime:nil
        completionCallbackType:AVAudioPlayerNodeCompletionDataPlayedBack
             completionHandler:^(AVAudioPlayerNodeCompletionCallbackType callbackType) {
               std::unique_lock<std::mutex> lock(_poolMutex);
               _nodePool.push(playerNode);
             }];
  }
}
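
The constructor changes are minimal by comparison - it takes ownership of the loader, and the old _fileToPlay setup disappears entirely. A sketch of just the initializer list; the node pool and engine startup are unchanged from the earlier listing:

AudioEngineAudioManager::AudioEngineAudioManager(
    std::unique_ptr<AudioEngineFileLoader> fileLoader)
    : _fileLoader(std::move(fileLoader)),
      _audioEngine([[AVAudioEngine alloc] init]),
      _playerNodes([[NSMutableArray alloc] init]) {
  // ...
}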

Makes sense so far. In Alfredo, we'll need to add a new MacAudioEngineFileLoader class.

alfredo/src/platform/MacAudioEngineFileLoader.h

#pragma once

#include <AudioEngineFileLoader.h>

#import <Foundation/NSURL.h>

namespace linguine::alfredo {

class MacAudioEngineFileLoader : public audio::AudioEngineFileLoader {
  public:
    NSURL* getUrlForEffect(EffectType effectType) override;
};

}  // namespace linguine::alfredo

alfredo/src/platform/MacAudioEngineFileLoader.mm

#include "MacAudioEngineFileLoader.h"

namespace linguine::alfredo {

NSURL* MacAudioEngineFileLoader::getUrlForEffect(EffectType effectType) {
  NSString* path;

  switch (effectType) {
  case Pop:
    path = @"Balloon Pop 1.wav";
    break;
  }

  if (path) {
    return [NSURL fileURLWithPath:path
                      isDirectory:false];
  }

  return nil;
}

}  // namespace linguine::alfredo

alfredo/src/main.mm snippet

auto audioFileLoader = std::make_unique<MacAudioEngineFileLoader>();
auto audioManager = std::make_shared<audio::AudioEngineAudioManager>(std::move(audioFileLoader));

This is where the switch statement lives for mapping effect types to file paths. This file can definitely get annoyingly large as a game accumulates more and more sound effects, but it's not so bad when you know exactly where that mapping is handled. Once again, if it becomes a problem, we can always revisit it later.

Scampi has a nearly identical header for its IosAudioEngineFileLoader, but the implementation differs to utilize NSBundle's way of creating URLs for resources.

scampi/src/platform/IosAudioEngineFileLoader.mm

#include "IosAudioEngineFileLoader.h"

namespace linguine::scampi {

NSURL* IosAudioEngineFileLoader::getUrlForEffect(EffectType effectType) {
  NSString* name;
  NSString* extension;

  switch (effectType) {
    case Pop: {
      name = @"Balloon Pop 1";
      extension = @"wav";
      break;
    }
  }

  if (name && extension) {
    return [[NSBundle mainBundle] URLForResource:name
                                   withExtension:extension];
  }

  return nil;
}

}  // namespace linguine::scampi

scampi/src/uikit/ScampiViewController.mm snippet

auto audioFileLoader = std::make_unique<linguine::scampi::IosAudioEngineFileLoader>();
auto audioManager = std::make_shared<audio::AudioEngineAudioManager>(std::move(audioFileLoader));

In both of these implementations, if I ever forget to add support for a specific EffectType, then attempting to use the sound effect will cause the application to crash in spectacular fashion by dereferencing a null pointer. It would be preferable to get some sort of compiler error, so that I could never forget - or better yet, code generation based on the files in my assets folder! It's not particularly difficult to achieve, but I'm not going to over-engineer this solution today.
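
One low-effort guard - an idea, not something I've actually enabled here - is to lean on the compiler's switch-exhaustiveness warning. Since neither switch statement has a default case, clang can warn about unhandled enum values, and promoting that warning to an error would break the build whenever a new EffectType is missing a mapping (assuming the executable targets are simply named alfredo and scampi):

target_compile_options(alfredo PRIVATE -Werror=switch)
target_compile_options(scampi PRIVATE -Werror=switch)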

7.3 Adding a New Clip

The value of creating this entire abstraction can only be judged by how "easy" it is, as a developer, to add a new sound effect to the game. Let's give it a shot.

I'm going to add a "selection" sound effect. I'll use the clip named Select 1.wav from the same sound effects pack from earlier. After copying that file into my assets/audio/ directory, I need to refresh my CMake project in order for the configure_file directive to be invoked for the new file. I'm probably going to forget about that step every time I go to add a new asset, but I likely won't do anything about it as long as I can quickly resolve the problem.

First, we need to add the new EffectType value.

linguine/include/audio/EffectType.h

#pragma once

namespace linguine {

enum EffectType {
  Pop,
  Select
};

}  // namespace linguine

Then we need to add the switch cases to the various file loaders.

alfredo/src/platform/MacAudioEngineFileLoader.mm snippet

case Select:
  path = @"Select 1.wav";
  break;

scampi/src/platform/IosAudioEngineFileLoader.mm snippet

case Select: {
  name = @"Select 1";
  extension = @"wav";
  break;
}

Finally, I can play the new sound effect from the various game systems. I'll inject the AudioManager into the RiserSystem and the RotatorSystem, and play the new sound effect from them both.

linguine/src/systems/RiserSystem.cpp

findEntities<Rising, Tapped>()->each([this](Entity& entity) {
  const auto speed = entity.get<Rising>()->speed;
  entity.remove<Rising>();

  auto falling = entity.add<Falling>();
  falling->speed = speed;

  _audioManager.play(EffectType::Select);
});

linguine/src/systems/RotatorSystem.cpp

findEntities<Rotating, Tapped>()->each([this](const Entity& entity) {
  const auto rotating = entity.get<Rotating>();
  rotating->speed = -rotating->speed;
  _audioManager.play(EffectType::Select);
});

That was easy! Clicking on a Rising or Rotating object now additionally plays the selection audio clip - it sounds like a "clink", like hitting a small glass tube with a spoon. Once you've clicked a Rising object, the Rising component is removed, and is replaced with a Falling one, so clicking on it no longer plays the sound, as intended.

7.4 File Streaming (and a Lack Thereof)

Audio files can be very large, and can eat up a lot of memory, which is a precious resource to a game engine. Many engines contain file streaming systems, capable of loading and playing audio files incrementally while dynamically unloading the parts of the files that have already been played.

This engine explicitly does not contain such a feature. As far as I can tell, the Audio Engine API is already doing that for me within an AVAudioPlayerNode, whenever I schedule an AVAudioFile to be played. The documentation does not explicitly state this, however, and therefore I cannot be sure whether or not that is the case.

File streaming adds a lot of complexity to this otherwise very simple audio player, so I'm going to defer implementing it for now. The audio clips I've chosen are 100 KiB and 114 KiB in size, and a modern smartphone has many gigabytes of memory available (though the operating system may impose strict limits on what a single app is allowed to use). I think I could get away with keeping hundreds, if not thousands, of sound effects entirely in memory if I needed to - a thousand clips of this size is only around 100 MiB - without streaming at all.

Music tracks are much larger than sound effects, due to their long form, but you need far fewer of them loaded at a time - usually only 1 or 2. These tracks can be many megabytes in size. Compressed MP3 files are generally under 10 MiB, even at higher bit rates. Uncompressed WAV files can easily be 10x the size of an MP3, and I wouldn't recommend using them for long-form tracks.

The size of the shippable application will also be a factor in what the engine needs to support. I definitely don't want to require users to download hundreds of megabytes for an otherwise very simple game. Time will tell what the package size ends up being, but I don't necessarily want audio to be the majority of it - unless it's a game about the music, I suppose. In any case, I'm going to completely ignore all of this for now.

One thing that stands out to me when comparing the audio manager to the renderer: audio playback is very event-driven, while rendering is more of a continuous process. It's somewhat easy to say "draw a triangle in this position, with this rotation, filled with this color", while giving it different parameters for each frame. As long as the positional and rotational adjustments are made with the delta time taken into account, then a fluctuating frame rate is still convincingly accurate to the user.

Audio is different. Sample rates for any given track are very specific. If you were to try to play a track "frame by frame" with a variable delta time between frames, the result would be terrible - at 48 kHz, a single 60 FPS frame corresponds to exactly 800 samples, and any variation in frame time would produce audible gaps and stutters. This is why the actual playback of the audio file occurs on a separate thread from the rest of the game. The game can request that a specific track starts playing in the very near future, but it isn't responsible for processing in tandem with the sample rate of the track.

This is by far the shortest chapter within the "engine development" section. I'm genuinely grateful for how easy it was to add audio playback into the engine. I had never actually developed an engine capable of playing audio at all, so this was a nice little learning experience for me. If you'd like to see the current state of the project, then check out this commit. It's a shame that I cannot visually display the audio capabilities of the engine, so you'll just have to go try it yourself.