
Speed and the new sound system

Some folks have been asking how the sound system rewrite will affect speed. The answer is, yes, it will make things a bit slower. Probably not enough to make a noticeable difference unless you're right on the edge, performance-wise. To understand why things will be slower, let's take a look at what happens with, say, Pac-Man sound in MAME today, versus what it will be in the future.

Today, the sequence of events for generating a chunk of sound in MAME looks like this:


  1. The Namco sound core generates samples at its internal frequency into a stream buffer.

  2. The mixer core resamples the stream buffer, applies gain, and sums the results into a mixer buffer.

  3. The mixer core finally clamps the mixer buffer to 16-bit dynamic range and combines the left/right buffers into a single stream (steps 2 and 3 are sketched in code below).

  4. That stream is fed to the OSD layer.
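
To make the costs concrete, here's roughly what steps (2) and (3) boil down to in C. This is just a sketch with invented names; the fixed-point formats and the linear-interpolation resampler are my own simplifications, not the actual MAME code:

   #include <stdint.h>

   /* step (2): resample one stream with linear interpolation, apply gain,
      and accumulate into the mixer buffer; 'step' is the input/output
      rate ratio in 16.16 fixed point */
   static void mix_stream(const int16_t *in, int32_t *mixbuf, int count,
                          uint32_t step, int32_t gain)
   {
       uint32_t pos = 0;                      /* 16.16 input position */
       for (int i = 0; i < count; i++)
       {
           int32_t s0 = in[pos >> 16];
           int32_t s1 = in[(pos >> 16) + 1];
           int32_t frac = pos & 0xffff;
           int32_t sample = s0 + (((s1 - s0) * frac) >> 16);
           mixbuf[i] += (sample * gain) >> 8; /* gain in 8.8 fixed point */
           pos += step;
       }
   }

   /* step (3): clamp the left/right mixer buffers to 16-bit range and
      interleave them into a single output stream */
   static void finalize_mix(const int32_t *left, const int32_t *right,
                            int16_t *out, int count)
   {
       for (int i = 0; i < count; i++)
       {
           int32_t l = left[i], r = right[i];
           if (l < -32768) l = -32768; else if (l > 32767) l = 32767;
           if (r < -32768) r = -32768; else if (r > 32767) r = 32767;
           *out++ = (int16_t)l;
           *out++ = (int16_t)r;
       }
   }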

In the new system, because of some abstraction layers I've introduced, the list of steps is a bit longer. Conceptually, sound flows in a graph from the sound cores through optional filters and into a final mixing stage. The mixer is no longer an integral component; rather, it acts as a filter with multiple inputs and a single output. The graph for Pac-Man is simple:

   Namco sound -> mixer filter -> "mono" speaker
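
One way to picture how such a graph might be represented (purely illustrative; the real structures in the rewrite may look nothing like this) is a node with some number of inputs and a single output:

   #include <stdint.h>

   #define MAX_INPUTS 16                      /* arbitrary for this sketch */

   typedef struct sound_stream sound_stream;
   struct sound_stream
   {
       int            sample_rate;            /* rate of this node's output */
       int16_t       *buffer;                 /* this node's output samples */
       int            num_inputs;             /* 0 for a sound core */
       sound_stream  *input[MAX_INPUTS];      /* upstream nodes feeding us */
       int32_t        input_gain[MAX_INPUTS]; /* per-input gain, 8.8 fixed */
       void         (*update)(sound_stream *node, int count); /* fill buffer */
   };

In this view, a sound core is a node with zero inputs, a filter is a node with one input and one output, and the mixer is a node with several inputs and one output.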

Since there is only one speaker on Pac-Man, there is only a single final output. Let's look at what happens internally:


  1. The Namco sound core generates samples at its internal frequency into a stream buffer.

  2. The stream system notices that the stream buffer from the Namco core is necessary for the mono speaker output. In response, it resamples the stream buffer and applies a gain, placing the result in a resampling buffer.

  3. The mixer filter adds all of its inputs together (using the resampling buffer data) and generates its output in a stream buffer.

  4. The sound core system then loops over all the speakers (one in this case), takes the stream buffer from the associated mixer, and adds the contents to either the left channel, right channel, or both channels, depending on the position of the speaker (see the sketch after this list).

  5. The sound core system finally clamps the left/right buffers to 16-bit dynamic range and combines them into a single stream.

  6. That stream is fed to the OSD layer.
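
Step (4), the speaker loop, might look roughly like this (the names and the simple left/right/both positioning are invented for illustration):

   #include <stdint.h>

   enum { SPEAKER_LEFT, SPEAKER_RIGHT, SPEAKER_BOTH };

   typedef struct
   {
       int            position;   /* where this speaker sits */
       const int16_t *buffer;     /* output of this speaker's mixer */
   } speaker_info;

   /* add each speaker's mixer output into the left/right accumulators */
   static void mix_speakers(const speaker_info *speaker, int num_speakers,
                            int32_t *left, int32_t *right, int count)
   {
       for (int s = 0; s < num_speakers; s++)
           for (int i = 0; i < count; i++)
           {
               if (speaker[s].position != SPEAKER_RIGHT)
                   left[i] += speaker[s].buffer[i];
               if (speaker[s].position != SPEAKER_LEFT)
                   right[i] += speaker[s].buffer[i];
           }
   }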

Now, there are some basic optimizations that can be done. In step (2), if the gain is 1.0, and the input sampling rate matches the sampling rate that is needed by the mixer, then we don't need to resample at all, so that can potentially save us a step.
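
In code, that fast path could be as simple as this (hypothetical names; resample_and_gain stands in for the general path, such as the linear interpolation sketched earlier):

   #include <stdint.h>

   /* general path: resample and apply gain into 'scratch' */
   void resample_and_gain(const int16_t *in, int16_t *scratch, int count,
                          int in_rate, int out_rate, int32_t gain);

   /* fast path: with matching rates and unity gain (0x100 in 8.8 fixed
      point), the input buffer can be handed along untouched */
   static const int16_t *get_resampled(const int16_t *in, int16_t *scratch,
                                       int count, int in_rate, int out_rate,
                                       int32_t gain)
   {
       if (in_rate == out_rate && gain == 0x100)
           return in;              /* skip the resampling pass entirely */
       resample_and_gain(in, scratch, count, in_rate, out_rate, gain);
       return scratch;
   }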

Step (4) also will eventually go away, and we will hand each speaker's stream to the OSD layer so that it can be output with appropriate 3D positioning. I don't plan to do this anytime in the near future, but it's a straightforward extension of the system I am putting together.

A couple of other things I am doing will slow down the system a bit. For one thing, I am getting rid of "mono" sound output. If a game is mono, it will output identical left/right data streams; that duplication costs a little, but there is little point in keeping the added complexity of supporting mono output at the OSD layer.
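
The extra cost of dropping mono is nothing more than a duplicated write at the end, something like this (illustrative only):

   #include <stdint.h>

   /* a mono game simply writes the same sample to both channels */
   static void write_mono_as_stereo(const int16_t *mono, int16_t *out,
                                    int count)
   {
       for (int i = 0; i < count; i++)
       {
           out[i * 2 + 0] = mono[i];   /* left */
           out[i * 2 + 1] = mono[i];   /* right */
       }
   }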

The other slowdown is that when the input sample rate is higher than the output sample rate at a resampling point, I intend to do a more accurate "sum of energy" calculation instead of simple linear interpolation or even cruder resampling. This should help reduce aliasing effects when the sound cores operate at a higher frequency than your sound card, but it will be more expensive.
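
Here is one way such a calculation might look. This sketch treats the "sum of energy" as a box filter: each output sample is the average of the input over the exact span of time it covers, including fractional input samples at the edges. The names and the 16.16 fixed-point bookkeeping are mine, not the actual implementation:

   #include <stdint.h>

   /* downsample by averaging the input over each output sample's span;
      assumes in_rate > out_rate */
   static void downsample_energy(const int16_t *in, int16_t *out,
                                 int out_count, int in_rate, int out_rate)
   {
       /* input samples per output sample, 16.16 fixed point */
       uint32_t step = (uint32_t)(((uint64_t)in_rate << 16) / out_rate);
       uint32_t pos = 0;

       for (int i = 0; i < out_count; i++)
       {
           uint32_t end = pos + step;
           uint32_t first = pos >> 16;
           uint32_t last = end >> 16;
           int64_t sum;

           if (first == last)
               /* span falls entirely within one input sample */
               sum = (int64_t)in[first] * (end - pos);
           else
           {
               /* fractional piece of the first sample */
               sum = (int64_t)in[first] * (0x10000 - (pos & 0xffff));
               /* whole samples in the middle */
               for (uint32_t j = first + 1; j < last; j++)
                   sum += (int64_t)in[j] << 16;
               /* fractional piece of the last sample */
               sum += (int64_t)in[last] * (end & 0xffff);
           }

           /* total weight is exactly 'step', so this is a true average */
           out[i] = (int16_t)(sum / step);
           pos = end;
       }
   }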

One additional slowdown I am considering is using 32-bit intermediate buffers for all the sound mixing. This allows us to more easily "overdrive" the sound and downmix into 16-bit only at the end, rather than clipping at each stage. Accessing the extra memory will cost at the cache level, but on the plus side, we can get rid of a lot of intermediate code that does clipping along the way and just do it in the final step.
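
The difference is easy to see side by side (sketch only):

   #include <stdint.h>

   /* 16-bit intermediates: every summing stage has to saturate to avoid
      wrapping, costing compares and branches per sample, per stage */
   static void accumulate_16(int16_t *mix, const int16_t *in, int count)
   {
       for (int i = 0; i < count; i++)
       {
           int32_t sum = mix[i] + in[i];
           if (sum < -32768) sum = -32768; else if (sum > 32767) sum = 32767;
           mix[i] = (int16_t)sum;
       }
   }

   /* 32-bit intermediates: stages just add, and clipping happens exactly
      once, in the final conversion down to 16 bits */
   static void accumulate_32(int32_t *mix, const int16_t *in, int count)
   {
       for (int i = 0; i < count; i++)
           mix[i] += in[i];
   }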

Next post I'll talk a little bit about filters and how they fit into the mix, so to speak.