Game Audio Paper
“Game Audio Rules”
Leeds Metropolitan University
5 Earls Court, Doncaster
In this paper the author will present a critical evaluation of the unique challenges presented by interactive audio.
The aim of this paper is to give the reader an understanding of interactive audio, with an overview of the challenges, issues, technologies and roles within the discipline. An additional folder of video examples accompanies this paper.
The term “interactive” has been widely discussed within games. Some theorists could argue that:
“a video game cannot be interactive because it cannot anticipate the actions of its players.” 
However, for the purposes of this paper the author will use the term “interactive” as it is used within the games industry and defined by practitioners such as Karen Collins (2008).
In order to fully understand what interactive audio is, we must first understand the roles audio plays within a video game, in addition to the terminology used within the industry.
So what is game audio?
In the article IEZA: A Framework for Game Audio, Sander Huiberts and Richard van Tol (2008) define game audio as:
“Interactive computer game play…the term game audio also applies to sound during certain non-interactive parts of the game — for instance the introduction movie and cutscenes.”
In addition to this description, the article also highlights two points that the author believes sum up what game audio should achieve:
1. Making the game play experience intense and thrilling.
2. Helping the player play the game by providing necessary game play information.
The first statement is backed up with two examples from the game Call of Duty: Black Ops. The first (example 1) shows the game play action with no audio. Comparing it with the second (example 2), the difference is that, with audio added, the action becomes more intense and thrilling.
An example of the second statement can be seen in (example 3), from the game Splinter Cell: Double Agent. As the player gets closer to the guards, they can be heard before they are seen, making the player aware of their approach and of the need to take appropriate action. In addition, the sound feedback the player receives over the radio is also important, both for achieving the mission goals and for making the player aware of any danger.
Many sounds make up a game's audio track, and many researchers have tried to explain and give an understanding of how these sounds fit within a game.
Collins, K. (n.d.) suggests that a broad categorization of game audio could lie under the terms:
- Diegetic – This refers to sounds coming from within the game world environment.
- Non-diegetic – This refers to sounds such as music, stabs, and sound effects not associated with the game world.
It is also suggested that this can be further categorized into dynamic and non-dynamic sound, before breaking it down even further into the types of dynamic sounds related to the activity of the player.
This gives a brief understanding of game audio; however, for a more in-depth analysis of the types of game audio we must look at other researchers who have tried to define this area.
Grimshaw & Schott (2007) propose an expansion of the terms diegetic and non-diegetic, breaking these categories down even further by introducing the following terms:
-Telediegetic – “sounds produced by other players but having consequences for one player”
-Ideodiegetic – “immediate sounds, which the player hears”
The ideodiegetic section can itself be categorized into two further subsections:
- Kinediegetic – “sounds triggered by the players actions”
- Exodiegetic – “all other Ideodiegetic sounds”
Grimshaw & Schott (2007)
Although the above explanations describe the types of audio within games, they can become overly complicated. Huiberts & van Tol (2008) created the IEZA (Interface, Effect, Zone, Affect) framework to show the types of audio in games (fig. 1).
This framework gives a good understanding of how the audio links together within a game.
Under the heading diegetic, two subsections are given: Zone and Effect.
Zone refers to noises such as wind, rain and ambience that represent the real-world or in-game environment. This is often described by audio designers as ambient or environmental sound.
Effect refers to sounds linked to diegetic parts of the game, either on screen or off screen; examples include footsteps or breathing.
Non-diegetic sound is again split into two sections: Affect and Interface.
Interface refers to non-diegetic sounds outside of the game world environment, such as those signalling health level or score.
Affect refers to sounds linked with non-diegetic parts of the environment, such as music designed for a specific target audience; a horror game, for example, may use a dramatic, spooky score.
There are many issues relating to game audio. In this next section the author will outline the numerous issues that can arise when designing and implementing game audio, in addition to any possible solutions currently being used to work around such problems.
The main issue audio designers face for games is the need to fit the majority, if not all, of the audio into less than 10% of the available RAM.
It is suggested that the audio designer must compromise one of the following: quality, memory or variation, in order to achieve a good outcome.
It is suggested that when designing audio for games:
“You can have lots of variation with good-quality sounds, but this will take up lots of memory”
“You can have good quality sounds that will fit into memory, but there won’t be much variation as we can’t fit many of them in”
“You can have lots of variation within small amounts of memory, but the quality of our sounds would have to go down”
Stevens, R. & Raybould, D. (2011) 
With this in mind it may help to look more in-depth into each of these areas.
3.1 Sample rates/file sizes
Audio designers will, the majority of the time, decrease the sample rate of the file. Taking a stereo file that is 1 minute in length, recorded at 44.1k, 16 bit, and converting it to 11k, 16 bit will save 7.5 MB, as the file sizes are 10 MB and 2.5 MB respectively.
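The saving quoted above can be checked with simple arithmetic: an uncompressed PCM file is sample rate × bytes per sample × channels × duration. The sketch below (assuming 1 MB = 2^20 bytes, which matches the figures quoted) is illustrative only:

```python
def pcm_file_size(sample_rate_hz, bit_depth, channels, seconds):
    """Uncompressed PCM size in bytes: rate * bytes-per-sample * channels * duration."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds

full = pcm_file_size(44100, 16, 2, 60)   # 44.1k, 16-bit stereo, 1 minute
down = pcm_file_size(11025, 16, 2, 60)   # down-sampled to 11.025k

# prints 10.1 MB -> 2.5 MB, a saving of roughly 7.5 MB
print(round(full / 2**20, 1), "MB ->", round(down / 2**20, 1), "MB")
```

Quartering the sample rate quarters the file size, which is exactly the trade-off between memory and quality discussed above.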
The problem with this method, however, is that the decreased sample rate will affect the sound quality. This may not be such a problem when working on a mobile platform with small speakers, but on a larger platform, where the player could have a very expensive and impressive surround sound setup, it could pose a problem: who would want to spend all that money just to hear a tinny sound coming out of high-quality speakers?
3.2 File types / Compression
Another widely used method, therefore, is to change the file type. This way the designer can keep the file at the highest possible sample rate; then, depending on the file type (e.g. MP3, OGG, WAV), the file size will be smaller due to the compression rates of said file types.
Brad Meyer backs this up in a 2011 article on gamasutra.com:
“Perhaps the single most revolutionary breakthrough in digital audio in the past 20 years was the invention of high compression, high quality codecs. Being able to compress a PCM file to one tenth or one fifteenth its normal size”
Meyer, B. (2011)
One problem that may occur with this method is that the engine and platform being worked on might not handle certain file types.
If the platform does allow this, there may be other difficulties, such as the audio team not having a dedicated audio programmer to write the additional code to allow for such changes. This may not be a problem in a large development team, as the audio programmer may simply tweak the engine to allow it; however, when working in smaller teams with no dedicated audio programmer, this may prove more challenging.
One way in which audio designers achieve variation in games is to break a large sound down into multiple smaller files; then, by adding parameters such as randomization and concatenation, much more variation can be achieved, rather than having to record the sound multiple times with different outcomes.
Fig. 2 shows one method of achieving variation. Each sound has been down-sampled to the lowest sample rate possible while still achieving realistic results. These sounds are then split into separate groups, assigned to play back randomly within each group, before being concatenated with the remaining groups. One sound is therefore chosen from each group to play back in a certain order. Variation can also be achieved if we take away the concatenation, leaving just the randomization.
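The randomize-then-concatenate approach described above can be sketched as follows. The group and file names are hypothetical, purely for illustration:

```python
import random

# Hypothetical sound groups, e.g. the parts of a longer composite sound.
groups = [
    ["impact_a.wav", "impact_b.wav", "impact_c.wav"],
    ["debris_a.wav", "debris_b.wav"],
    ["tail_a.wav", "tail_b.wav", "tail_c.wav"],
]

def concatenated_variation(groups):
    """Pick one sound at random from each group, preserving group order,
    so the chosen sounds can be played back-to-back as one composite sound."""
    return [random.choice(g) for g in groups]

playlist = concatenated_variation(groups)
```

With only 8 source files this yields 3 × 2 × 3 = 18 possible composite sounds, which is the memory-versus-variation win the technique is aiming at.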
4. Non-repetitive design
There is an increasing need for non-repetitive audio design in today's games. As levels and maps become larger, the player's urge to explore them, as well as the time spent in the environment, increases; alternatively, the player may simply play the same section of the game over and over.
For the player to be immersed in the game, audio designers must find ways to make the environment sound as realistic and as dynamically changing as possible. One way to achieve this is to create non-repetitive audio design.
5. Sound Propagation
Sound travels through the air as a longitudinal wave, forcing the air molecules to vibrate and allowing the sound to travel. The way this sound reacts with different surfaces, such as how it reflects and diffracts, in addition to the way it fades over time and distance, is known as sound propagation. In summary, sound propagation is the term used to describe the way in which sound reacts in and around different environments.
This is backed by Parfit, D. (2005), who suggests:
“Sound propagation characteristics include attenuation due to distance and air, reflection, obstruction and occlusion”
Therefore, to gain a greater understanding of sound propagation we must look at each of these characteristics.
Fig. 3 shows a representation of how attenuation works within a game engine; for the purposes of this explanation the engine in question will be UDK (Unreal Development Kit).
A - Sound source
B – Maximum loudness of the sound
C – Minimum loudness of the sound
D – Listener
As the listener (D) moves closer to the sound source and enters the area around section C, the sound will be heard at a low level. As the listener moves closer still, the volume will increase until the listener is at position B, where the volume will be at its maximum. As this process is reversed and the listener moves from B to C, the sound will attenuate, decreasing in volume.
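The behaviour just described amounts to a fall-off curve between the two radii. A minimal sketch, assuming a simple linear curve (engines such as UDK offer several fall-off shapes; linear is only one of them):

```python
def attenuated_volume(distance, min_radius, max_radius):
    """Linear distance attenuation: full volume inside min_radius (position B),
    silent at or beyond max_radius (position C), linear fade in between."""
    if distance <= min_radius:
        return 1.0
    if distance >= max_radius:
        return 0.0
    return 1.0 - (distance - min_radius) / (max_radius - min_radius)

# Listener halfway between B (5 m) and C (20 m) hears the sound at half volume.
halfway = attenuated_volume(12.5, 5.0, 20.0)
```

The same function works in both directions of travel, which is why the attenuation is symmetrical as the listener moves from C to B and back again.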
Fig. 4 shows the reflections within a square room; these reflections are important for the player to have a realistic sense of the environment. The direct signal will be unprocessed, though attenuation may be added depending on the distance. It is the reflections of the sound, however, that are important in this example. Reflection 1 will bounce off two walls before reaching the listener, so the sound will be quieter than the direct signal, as it has further to travel and some of the energy is lost. Reflection 2 will bounce off one wall, meaning the sound will be louder than reflection 1 and will reach the listener before it, but will still be quieter than the direct signal. This all adds to the overall reverb of the environment.
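Why reflection 2 arrives earlier and louder than reflection 1 follows directly from path length. A rough sketch, assuming the inverse-distance law and ignoring energy absorbed at each wall bounce:

```python
SPEED_OF_SOUND = 343.0  # metres per second in air at room temperature

def reflection(path_length_m, direct_length_m):
    """Arrival delay (ms) and level of a reflection relative to the direct
    signal, using simple 1/r distance fall-off. Absorption per bounce is
    ignored, so real reflections would be quieter still."""
    delay_ms = (path_length_m - direct_length_m) / SPEED_OF_SOUND * 1000.0
    level = direct_length_m / path_length_m  # fraction of the direct level
    return delay_ms, level

# A reflection travelling 10 m when the direct path is 4 m:
delay, level = reflection(10.0, 4.0)
```

A shorter reflected path (fewer bounces, as with reflection 2) gives both a smaller delay and a level closer to the direct signal, exactly the ordering described above.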
Many game engines have reverb presets, because calculating reverb in real time can be heavy on the CPU. These presets can be used with good results, depending on the game; however, the sound designer may end up using the same reverb preset multiple times for slightly different environments, and this may start to become too familiar to the listener.
Fig. 5 shows the sound source “A” and a listener “B” separated by a wall with open spaces at either side. When sound travels between points A and B it will be obstructed. Sound passing between the obstructions will be unfiltered; however, sounds behind the obstructions should be filtered and attenuated depending on the surface type, texture and size.
Fig. 6 shows the listener “A” and the sound source “B” with a solid wall between the two points, so the direct sound will not be heard. Depending on the thickness of the wall, any sound that passes through it will need to be filtered and attenuated.
Fig. 7 shows a sound source “A” and a listener “B” separated by a wall with a small open window. Ignoring the fact that the listener is standing in the same room, and imagining the two points were in separate rooms, the direct sound will be clear; however, the reflections of these sounds will be muffled due to the geometry of the wall.
6. Subjective sound and mixing
To make a game sound aesthetically pleasing, just like a record that has been mixed and mastered, the final mix of a game must be mixed to a high standard. However, unlike mixing for the music industry, which adheres to the Red Book standard, there is as yet no final standard of quality that the games industry must abide by. In addition, there are many more variables that the audio designer must address when mixing for games, some of which are highlighted in the sections below.
In a 2010 article for gamasutra.com by Rob Bridgett, Mellroth, K. (2010) discusses the importance of mixing and suggests “radius is as, or more important than, volume modifiers”.
6.2 Dynamic range
The article also discusses dynamic range: the loudest and quietest sounds are identified, and all other sounds are then ranked between the two. An example is given from the game Fable II, where Mellroth, K. (2010) states that the Troll slam attack is the loudest sound heard, so even the gunshot should be quieter.
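The ranking idea can be sketched as a mapping from rank onto a dB scale anchored by the two extremes. The dB range and number of ranks below are illustrative assumptions, not values from the article:

```python
def rank_to_gain(rank, quietest_db=-40.0, loudest_db=0.0, n_ranks=10):
    """Map a sound's rank (0 = quietest ... n_ranks-1 = loudest) onto a dB
    scale anchored by the quietest and loudest sounds in the game.
    The -40 dB floor and 10 ranks are hypothetical, for illustration only."""
    step = (loudest_db - quietest_db) / (n_ranks - 1)
    return quietest_db + rank * step

# With the Troll slam at the top rank, a gunshot one rank below sits quieter.
troll_slam = rank_to_gain(9)
gunshot = rank_to_gain(8)
```

Anchoring everything to the two extremes means any new sound only needs a rank, not an absolute level, which keeps the mix consistent as content is added.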
6.3 Software Tools
Each audio engine will have its own structure for mixing. An example of this is FMOD, which has been used to mix games such as Heavenly Sword (2007) and Little Big Planet (2008).
With FMOD the audio designer has the ability to create a hierarchical structure of sounds with parent and child snapshots; the child snapshots can override the parent snapshots, with each bus having its volume and pitch values altered in real time.
Wwise also offers a similar mixing hierarchy, with the addition of passive mixing, achieved through effects such as peak limiting and auto-ducking, alongside an active system that works like FMOD's system described previously.
6.4 Emotional response vs. realism
The emotional response, that the player receives from the game, can be determined by the audio designer during the mix stage. This is an integral part of the overall sound of the game.
Whether the audio designer decides to duck a certain sound, increase the ambience or footsteps, or cut them out completely, each mix will achieve a different emotional response from the player. This is what the audio designer must get right in order to achieve the best outcome for the game.
So should the player hear someone else reloading?
Mellroth, K. (2010) was asked a similar question in a gamasutra.com article and stated:
“We wrestle with these questions and sometimes make compromises”
It could be suggested that the needs of the game, i.e. the overall experience of the sound and the emotional feedback, could outweigh the realism of a natural soundscape.
Writing and implementing music for games is a very complex procedure.
Depending on the developer or the desired outcome of the game, the composer may be responsible for composing the music to fit the action, or the audio designer may be responsible for choosing the implementation so that the music transitions well with the action. Whether it is arranging to bars and beats or cross-fading and layering, both the composer and the audio designer must understand each other's work to be able to convincingly pull off such a complex concept.
One commonly used technique for implementing music is vertical or horizontal layering. These systems call up each individual stem when needed, allowing for various mixes. Below is an example of a layered music system within the UDK engine.
Fig. 8 shows a simple vertically layered music system. As the player triggers the first layer, all other layers also play, keeping the music in time. This allows the player to switch between any instruments within the track, depending on the actions taken within the game. This technique can be used in many audio engines, such as FMOD, Wwise and UDK.
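The core of such a system can be sketched engine-agnostically: every stem starts at the same moment and keeps playing, and intensity only changes which stems are audible. This is a hypothetical sketch, not the API of UDK, FMOD or Wwise:

```python
class VerticalMusicSystem:
    """All stems start together and stay in sync; game intensity merely
    mutes or unmutes layers, so transitions never lose the beat."""

    def __init__(self, stems):
        self.stems = stems                      # ordered low- to high-intensity
        self.audible = {s: False for s in stems}

    def set_intensity(self, level):
        """Unmute the first `level` stems; the rest stay silent but playing."""
        for i, stem in enumerate(self.stems):
            self.audible[stem] = i < level

music = VerticalMusicSystem(["drums", "bass", "strings", "choir"])
music.set_intensity(2)   # e.g. combat starts: drums and bass become audible
```

Because silent stems keep running rather than restarting, raising the intensity later brings the strings and choir in exactly on time.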
The term “algorithm” on its own means little to audio designers; it is often thought of as a set of programmatic instructions to achieve an output value in as few steps as possible. However, Farnell (2007) suggests that algorithmic sound is not interested in the final value but in the steps taken to achieve the final result (pp. 4).
“Algorithmic sound tends to refer to the data, like that from a sequencer”
It could be suggested that algorithmic audio could be used as another tool to achieve game audio design. Farnell sums this up by introducing the term procedural audio:
“Procedural audio is non-linear, often synthetic sound, created in real time according to a set of programmatic rules and live input”
8.1 So what is procedural audio?
Veneri, O. et al. (n.d.) give an insight into the types of sound achievable using procedural audio, broken down into two categories. The first is procedural sound, which consists of synthesizing sounds such as solid objects, water, air, fire, footsteps and bird sounds.
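To give a flavour of what "synthesizing air" might mean in practice, the crude sketch below generates a wind-like signal from rules rather than samples: white noise through a one-pole low-pass filter, with a slow swell controlling intensity. This is the author's own illustration, not a technique from Veneri et al.:

```python
import math
import random

def wind_samples(n, sample_rate=11025, seed=1):
    """Very rough procedural 'wind': random noise smoothed by a one-pole
    low-pass filter, multiplied by a slow sine swell. Illustrative only."""
    rng = random.Random(seed)
    out, lp = [], 0.0
    for i in range(n):
        noise = rng.uniform(-1.0, 1.0)
        lp += 0.02 * (noise - lp)   # one-pole low-pass: lp = 0.98*lp + 0.02*noise
        swell = 0.5 + 0.5 * math.sin(2 * math.pi * 0.25 * i / sample_rate)
        out.append(lp * swell)      # intensity rises and falls at 0.25 Hz
    return out

samples = wind_samples(11025)       # one second of 'wind'
```

Because the output is computed per sample from live parameters, the swell rate or filter coefficient could be driven by in-game weather, which is exactly what sampled audio cannot do.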
The second category described by Veneri, O. et al. (n.d.) is procedural music; an example of this technique can be seen in the video game Spore (2008). This game uses a specially developed in-house tool replicated from the open-source tool Pure Data (PD).
Procedural audio has advantages and disadvantages, and these must be weighed up before deciding to use the technique when designing audio for games.
First, the advantages. Procedural audio offers a highly dynamic and flexible approach that generates sounds in real time with unlimited variety, unlike a sampled piece of audio that plays back the same sound over and over. Real-time parameters can be applied, decreasing the chance of repetition and increasing the level of detail, as more parameters can be adjusted.
The disadvantages of using procedural audio could outweigh the positives, as there is an increase in CPU cost. Procedural audio can also be time-consuming to implement, and not all sounds are suitable, as more complex sounds are harder to synthesize and to make sound natural. Procedural audio also needs specialist audio engines to work, and not all developers will have the resources available.
9. Looking forward
It is extremely difficult to predict the future of game audio. It is suggested that model-based synthesis will become increasingly abundant within games, for example through the use of procedural audio and systems such as Wwise SoundSeed Wind and Woosh. This will allow games to have more variety without the need to record additional sounds, freeing up the valuable space needed.
However, with increases in RAM, hardware, disk availability and, more recently, cloud gaming, some developers may choose to increase the allocation available to game audio, allowing higher sample rates, which in turn deliver a cleaner, more dynamic sound to the listener.
One recent technique, currently applied in the Battlefield: Bad Company titles, is the use of a High Dynamic Range (HDR) mixing system. This system uses a window with set minimum and maximum output values that constantly moves, filtering out any sounds below the minimum threshold.
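The essence of the moving window can be sketched for a single instant: the loudest currently playing sound sets the window's top, and anything falling below the window's bottom is culled. The 24 dB window width and the level representation below are illustrative assumptions, not values from DICE's actual system:

```python
def hdr_mix(sounds, window_db=24.0):
    """Keep only sounds within `window_db` of the loudest currently playing
    sound; everything below the moving floor is culled. Each sound is a
    (name, level_db) pair. Simplified single-frame sketch of HDR mixing."""
    if not sounds:
        return []
    ceiling = max(level for _, level in sounds)
    floor = ceiling - window_db
    return [(name, level) for name, level in sounds if level >= floor]

playing = [("explosion", 0.0), ("gunfire", -10.0), ("footsteps", -30.0)]
audible = hdr_mix(playing)   # the footsteps fall below the window and are culled
```

Re-evaluating this every frame makes the window "move": when the explosion ends, the ceiling drops and the footsteps become audible again, which is what gives the mix its perceived dynamic range.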
The above techniques will vary depending on areas such as the platform and genre, the developer and the desired outcome. We will, however, possibly see increased use of all the above techniques in future games.
Collins, K. (2008) Game Sound: An Introduction to the History, Theory and Practice of Video Game Music and Sound Design. London: MIT Press, pp. 3
Huiberts, S. & van Tol, R. (2008) IEZA: A Framework for Game Audio <http://www.gamasutra.com/view/feature/3509/ieza_a_framework_for_game_audio.php> [Accessed 15th November 2011]
 Huiberts, S. (2010) Captivating sound: The role of audio for immersion in video games.
 Collins, K. (n.d) An Introduction to the Participatory and Non-Linear Aspects of Video Game Audio.
Grimshaw, M. & Schott, G. (2007) Situating Gaming as a Sonic Experience: The Acoustic Ecology of First-Person Shooters pp. 467
 Stevens, R. & Raybould, D. (2011) The Game Audio Tutorial A Practical Guide to Sound and Music for Interactive Games. London, Focal Press pp.36.
Meyer, B. (2011) AAA-Lite Audio: Design Challenges And Methodologies To Bring Console-Quality Audio To Browser And Mobile Games. <http://www.gamasutra.com/view/feature/6393/aaalite_audio_design_challenges_.php?page=2> [Accessed 15th November 2011]
 Parfit, D. (2005) Interactive Sound, Environment, and Music Design for a 3D immersive Video Game. 17 August, New York pp.4
Splinter Cell Double Agent PC Gameplay Mission 7 Cozumel - Cruiseship Part 1 of 2 (2011) uploaded by engelbrutus <http://www.youtube.com/watch?v=wqGZ61-mtL8&feature=related> [Accessed 20th November 2011]
Heavenly Sword (2007) Ninja Theory, Sony Computer Entertainment Inc.
Little Big Planet (2008) Media Molecule, Sony Computer Entertainment Inc.
Spore (2008) Maxis, Electronic Arts.
Call of Duty: Black Ops (2010) Treyarch, Activision.
 Bridgett, R. (2010) The Game Audio Mixing Revolution <http://www.gamasutra.com/view/feature/4055/the_game_audio_mixing_revolution.php?print=1> [Accessed 20 th November 2011]
Farnell, A. (2007) An Introduction to Procedural Audio and its Application in Computer Games. 23 September.
Veneri, O., Gros, S. & Natkin, S. (n.d.) Procedural Audio using GAF
Miguel, I. (2010) Battlefield Bad Company 2 – Exclusive Interview with Audio Director Stefan Strandberg <http://designingsound.org/2010/03/battlefield-bad-company-2-exclusive-interview-with-audio-director-stefan-strandberg/> [Accessed 21st November 2011]