Audio System

General Notes

The Corona Audio system gives you access to advanced OpenAL features. The older Corona Event Sound system will eventually be deprecated. Please read this section for important information on Android limitations, file formats, and performance tips.

Maximum number of simultaneous channels

The maximum number of channels is currently 32, which means you can play up to 32 distinct sounds simultaneously. You can query this number via the following property:

    audio.totalChannels

Best Effort Timing

This audio system is a best-effort system: you tell us to play a sound, and we try to start it as quickly as possible. There is no guarantee that things will start or end at exact times. For example, if you are streaming a song and there is a buffer underrun due to stress on the CPU, playback continues as soon as it can. If this causes a 1 second delay, the song will end 1 second later than it would have without the interruption.

Streams vs. Sounds

Notice that the new audio API no longer distinguishes between "event sounds" and non-event sounds, and there are no longer two distinct audio APIs for handling each. Instead, the only API difference is in how you load the sound: loadSound( ) preloads an entire sound into memory, while loadStream( ) prepares a sound to be played by reading small chunks at a time to save memory. All the new audio APIs can be applied to files loaded either way. However, there are some subtle side effects, and in a few cases the differences cannot be made completely transparent.
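
For example (the file names here are hypothetical), a short effect would be loaded with loadSound( ) and a long music track with loadStream( ), and both are played through the same audio.play( ) call:

  -- Short sound effect: decoded entirely into memory up front.
  local laserSound = audio.loadSound("laser.wav")

  -- Long music track: decoded in small chunks as it plays, to save memory.
  local backgroundMusic = audio.loadStream("music.m4a")

  -- The same playback API works for both.
  audio.play(laserSound)
  audio.play(backgroundMusic)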

Universal file formats (supported formats no longer dictated by "event"/"non-event")

Before, different formats were supported depending on whether or not you used event sounds. This restriction has been eliminated: audio.loadSound( ) and audio.loadStream( ) support exactly the same formats.

Shared/multiple simultaneous playback

Sounds loaded via loadSound( ) can be played back simultaneously on multiple channels. For example, suppose you load an explosion sound effect via:

  explosionSound = audio.loadSound("explosion.wav")
  

Now suppose you need to make multiple things explode in your game: enemy ships, asteroids, missiles, etc. Our audio engine is highly optimized to handle this case. Just call audio.play() with that same handle as many times as you need (up to the maximum number of channels), e.g.

  audio.play(explosionSound)
  audio.play(explosionSound)
  audio.play(explosionSound)
  audio.play(explosionSound)
  audio.play(explosionSound)

Don't load multiple instances of "explosion.wav" via loadSound. It is a waste of memory.

But for audio.loadStream( ), the handle cannot be played across multiple channels simultaneously. A single stream handle can only play on one channel at a time. So if you need to play multiple simultaneous instances of the same file, you should load multiple instances of it and use them separately, e.g.

  musicHandle1 = audio.loadStream("music.wav")
  musicHandle2 = audio.loadStream("music.wav")
  

Since streams use relatively little memory, this shouldn't be a serious concern, and needing to play the same stream on multiple channels at the same time rarely comes up.

Seeking and Rewinding

Again, there are subtle differences between loadSound( ) and loadStream( ) here. See the API docs for audio.seek( ) and audio.rewind( ) for the details.
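
As a rough sketch of the pattern (assuming the current behavior of audio.seek( ) and audio.rewind( ), with times given in milliseconds): streams are seeked and rewound through their handle, while sounds loaded with loadSound( ) are seeked and rewound through the channel they are playing on.

  -- Stream: seek/rewind operate on the audio handle itself.
  local music = audio.loadStream("music.wav")
  audio.play(music)
  audio.seek(30000, music)   -- jump 30 seconds into the stream
  audio.rewind(music)        -- back to the beginning

  -- loadSound: operate on the channel the sound is playing on.
  local effect = audio.loadSound("effect.wav")
  local channel = audio.play(effect)
  audio.rewind({ channel = channel })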

Latency

Sounds loaded with loadSound will have the lowest latency; they play as quickly as possible. Conversely, sounds loaded with loadStream must start decoding their first chunks, so there may be higher latency. We try to minimize this latency as much as possible, but for things that need low latency, like sound effects in a game, always use loadSound.

Memory management

Currently, you are responsible for calling audio.dispose() on your loaded samples when you are completely done with them if you want to recover the memory. If you plan to use the sounds for the life of the program, you probably don't need to worry about this.
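
A minimal sketch of the pattern:

  local explosionSound = audio.loadSound("explosion.wav")
  -- ... play it as often as you like ...

  -- When completely done with it (and it is not playing on any channel):
  audio.dispose(explosionSound)
  explosionSound = nil   -- clear the handle so it cannot be used accidentally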

Performance tips

Pre-loading phase

Pre-loading all your files in some kind of startup phase is strongly recommended. Both loadSound( ) and loadStream( ) block. While loadStream( ) is generally fast, loadSound( ) may take a while since it must load and decode the entire file on the spot. Generally, you don't want to be calling loadSound( ) in the parts of your app where users expect it to be the most responsive (e.g. don't call it in the middle of gameplay).
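
For example, a loading phase might gather everything into one table up front (the file names are hypothetical):

  -- Startup/loading phase: block here, not during gameplay.
  local sounds = {
      laser = audio.loadSound("laser.wav"),
      explosion = audio.loadSound("explosion.wav"),
      music = audio.loadStream("music.m4a"),
  }

  -- Later, during gameplay, playback is cheap:
  audio.play(sounds.laser)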

Unloading phase

Conversely, you should have a close-down phase to unload all the files you loaded. For apps that use the sounds throughout the life of the program, you can skip this. But imagine a game with multiple levels/stages. When the player gets to stage 2, you might want to unload all the stage 1 sounds so you have memory to load the stage 2 sounds. (Stage loading is a great time to preload the next batch of sounds too.)
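
A sketch of such a transition, assuming you kept the stage 1 handles in a table (stage1Sounds is a hypothetical name):

  -- Make sure none of these are still playing, then release them.
  for name, handle in pairs(stage1Sounds) do
      audio.dispose(handle)
      stage1Sounds[name] = nil
  end

  -- This is also a good moment to preload the stage 2 sounds.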

audioPlayFrequency

In your config.lua file, you may specify a field called audioPlayFrequency, e.g.

 
  application =
  {
      content =
      {
          width = 320,
          height = 480,
          scale = "letterbox",
          audioPlayFrequency = 22050
      },
  }
  
This is an optimization hint that tells the underlying OpenAL system what sample rate to mix and play back at. For best results, set this no higher than you actually need: if you never need better than 22050 Hz playback, set it to 22050, but if you really do need high quality, set it to 44100. Note that this is only a hint, and the underlying audio system is free to ignore it, though on Mac and iOS it seems to be consistently respected. Also, for best performance, all the files you include should be encoded at the same frequency you set here. For example, if you set this to 22050, all your files should be encoded at 22050 Hz.

Supported values are 11025, 22050, and 44100. Other values are not tested.

Use mono sounds when possible

Mono sounds have two advantages. First, they take half the memory of stereo sounds. Second, OpenAL will only apply spatialized/3D effects to mono sounds; OpenAL will not apply 3D effects to stereo samples. While we don't officially support 3D effects in this release, this is something we plan for future releases.

linear PCM

For fastest loading/decoding time, use linear 16-bit signed integer little-endian raw PCM samples. Most commonly, .wav files use this format. (However, .wav files sometimes use other codecs.)

Hardware decoder acceleration

iOS provides a hardware decoder chip that handles certain formats, namely .mp3, .aac/.mp4/.m4a/.3gp/.3gp2, and Apple Lossless (ALAC). Technically speaking, the hardware decoder can only decode one of these files at a time. If the hardware decoder is in use, the system falls back to software, but decoding these formats is generally expensive on mobile processors.

Typically this is only a concern when playing streaming sounds (since non-stream sounds are preloaded serially). So generally, you might encode background music as MP3 or AAC, but if you need a second streaming track to play simultaneously, such as speech, you might encode that one as WAV (or, if iPhone-only, IMA4 in a .caf container yields better compression with little difference in CPU decoding cost over WAV). However, as an implementation detail, most of our decoding under the hood is a serial process, so this may not turn out to be a real worry. It is not easy to tell whether iOS fell back to software decoding; your only clue is a significant spike in CPU utilization. We would appreciate feedback on how this scales for you if you push the limits.

Volume (state) persistence

When you set the volume on a channel or use the fade APIs (which actually set the volume on the channel), these values remain persistent until you change them again. When using auto-assigned playback channels with audio.play(), you might run into some nasty surprises if you are not careful.

So imagine you played some music that was auto-assigned to channel 1 and you changed the volume on that channel. Some time later, the music ends and the channel is freed. You then play a new sound, and the auto-assignment picks channel 1. But you hear something at the wrong volume, or worse, you hear nothing because the volume on the channel was set to 0.0.

So there are several techniques for dealing with this.

First, it is often the case that you only need to set per-channel volume for specific categories like "music", "speech", and "sound effects". If so, we recommend using audio.reserveChannels() to block off certain channels from being auto-allocated. For instance, use audio.reserveChannels(2) to prevent channels 1 and 2 from being auto-assigned. Then always play music on channel 1 and speech on channel 2, and set their volumes directly using the channel numbers so you don't get any surprises.
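
A sketch of that setup (the handle names and volume levels are just placeholders):

  audio.reserveChannels(2)   -- channels 1 and 2 will never be auto-assigned

  -- Set the per-category volumes once; they persist on these channels.
  audio.setVolume(0.7, { channel = 1 })   -- music
  audio.setVolume(1.0, { channel = 2 })   -- speech

  audio.play(backgroundMusic, { channel = 1, loops = -1 })
  audio.play(speechSound, { channel = 2 })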

If this is not the case, you could use the onComplete callback system as an opportunity to reset the volume of the channel that just finished playing.
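
For example (assuming the completion event carries the finished channel number; quietSound is a placeholder handle):

  local function restoreVolume(event)
      -- Reset the channel that just finished back to full volume.
      audio.setVolume(1.0, { channel = event.channel })
  end

  local channel = audio.play(quietSound, { onComplete = restoreVolume })
  audio.setVolume(0.3, { channel = channel })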

Or, before you start playing, you can use audio.findFreeChannel() to get an auto-assigned channel, set the volume on it, and then play on that specific channel.
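
Something like this (mySound is a placeholder handle):

  local channel = audio.findFreeChannel()       -- ask for a channel that is currently free
  audio.setVolume(0.5, { channel = channel })   -- fix its volume before playing
  audio.play(mySound, { channel = channel })    -- then play on that specific channel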

Also be careful about using volume on channels in general because these will impact 3D effects in future releases.

Other Encoding Format tips

Patents and Royalties

Among highly compressed formats such as MP3 and AAC (a.k.a. MP4), AAC is the better option. AAC is the official successor to MP3 by the MPEG Group. MP3 technically has patent and royalty issues you may need to concern yourself with if you distribute anything yourself; consult your lawyers for guidance. When AAC was ratified, it was agreed there would be no royalties required for distribution.

Ogg Vorbis is a royalty free and patent free format. However, this is not supported on iOS.

Cross-Platform

Linear 16-bit signed little-endian .WAV works everywhere. MP3 is also supported everywhere (though see the royalty concerns above).

Looping

Beware that certain formats (particularly highly compressed, lossy formats like MP3, AAC, and Ogg Vorbis) can pad or remove samples at the ends of an audio clip and potentially break a "perfect-looping" clip. If you are experiencing gaps in looping playback, try using WAV, and make sure your lead-in and ending points are clean.
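
For reference, looping itself is requested through the loops option of audio.play( ), where -1 means loop indefinitely:

  local loopingMusic = audio.loadStream("music_loop.wav")
  audio.play(loopingMusic, { loops = -1 })   -- -1 loops forever; 0 (the default) plays once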

getDuration

audio.getDuration( ) may return imperfect information, particularly for files loaded via loadStream. Different encoding formats may yield different results, so if you want better information, try a different format. MP3s are notorious for giving bad duration information.
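
For reference, a minimal usage sketch (musicHandle is a placeholder; the return value is in milliseconds):

  local duration = audio.getDuration(musicHandle)   -- milliseconds; may be inaccurate for streams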

Audio Session Services for iOS

iOS has a unique set of services called Audio Session Services which, among other things, allow you to configure audio behavior policies for your application. These policies include things such as "should your application's audio be silenced by the Ring/Silent switch?" or "should iPod audio continue playing when your audio starts?" A special API has been provided in Corona to give you access to these properties if the default behaviors are not sufficient. These APIs should be considered for advanced users only; using incorrect settings may break your application.

Please refer to this forum post for more information: Corona SDK:new audioSession properties. (TBD)

Secret/Undocumented Audio APIs

OpenAL "3D" Effects

There are a number of undocumented audio APIs that provide direct access to OpenAL 3D effects. Please refer to this blog post for more information: The secret/undocumented audio APIs in Corona SDK.

Streaming Audio Performance Tuning

Note: The following information is for users experiencing problems with the performance of streaming audio. Altering the parameters of the audio.loadStream API from their default values can cause unexpected side effects, and we strongly discourage it.

There are a few undocumented audio APIs that were created in case people need to fine-tune performance when buffer under-runs are a serious problem.

The audio.loadStream API supports a table parameter with 4 keys:

  audio.loadStream(file [, baseDir], {bufferSize=8192, maxQueueBuffers=12, startupBuffers=4, buffersQueuedPerUpdate=2})

(Note: The values listed here are the current defaults.)

bufferSize is the number of bytes for a single OpenAL buffer for the streaming buffer queue. This value should be a power of two. The default value is 8192.

maxQueueBuffers is the maximum number of read-ahead buffers (each of size bufferSize). The idea is that when the system is idle, we can read ahead and fill more buffers in advance, so if a CPU-heavy period comes and the system can't get enough time to read more buffers, there is a reserve pool to draw from. Be warned, increasing this number eats more memory (each buffer is bufferSize plus minor overhead). maxQueueBuffers cannot be less than startupBuffers.

startupBuffers is the number of buffers to read ahead before starting playback. This must not exceed maxQueueBuffers. This is also the number of buffers to read ahead when playback stops because of starvation (e.g. the system ran out of read-ahead buffers due to CPU load). Be warned, setting this higher means more delay between the time you call audio.play() and the time you start hearing the sound.

buffersQueuedPerUpdate is the number of buffers to read ahead every update cycle. You can think of an update cycle as once per frame, though it may be less frequent than that; the update happens on a background thread, so the timing is non-deterministic. Increasing this value may mean you spend more time in the audio update pass, which leaves less time for other things, like drawing or physics. buffersQueuedPerUpdate cannot be less than 1.

If you do use this API to fine-tune streaming audio in your application and find parameters that work better than the default values, please let us know by emailing us at support@anscamobile.com.