Fixing System 16B sound, part one

So I ended up going back & forth a few times this weekend trying to figure out how to make the UPD7759 chip work. The big problem is that the chip runs in two modes. The first mode is "standalone" mode, where the chip is directly connected to an external ROM. A sample number is written to the chip, and the /START line is toggled. At that point the chip takes over entirely, looking up the sample address in a table at the start of the ROM, and then playing back the sample data. Most games use this mode, and it works quite nicely.

Unfortunately, things are not so simple with Sega System 16B, because it uses "slave" mode, the other mode the chip supports. In this mode, there is no external ROM directly connected to the chip. Instead, all the sample data is fed by the Z80 controlling the sound chip, one byte at a time. The key to this whole thing working is the /DRQ ("data request") signal that the UPD7759 uses to indicate that it needs more data from the Z80. In the case of System 16B, /DRQ is connected to the /NMI signal on the Z80, so each data request triggers a non-maskable interrupt. The interrupt handler then looks up the next byte in the banked sample ROM and feeds it to the UPD7759.
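To make the handshake concrete, here is a toy sketch in Python of the request/response loop described above. It is not the real UPD7759 or Z80 logic: the class names, the callback wiring, and the flat (unbanked) sample ROM are all simplifications I made up for illustration.

```python
class SlaveModeChip:
    """Toy stand-in for a sample chip in slave mode (hypothetical, not the real chip)."""

    def __init__(self, on_drq):
        self.on_drq = on_drq   # callback standing in for the wire from /DRQ to /NMI
        self.received = []     # bytes the chip has been fed so far

    def request_byte(self):
        # Assert /DRQ: the chip has no ROM of its own, so it asks the CPU for data.
        self.on_drq()

    def write_data(self, value):
        # The CPU answers the request by writing one byte to the chip.
        self.received.append(value & 0xFF)


class SoundCpu:
    """Toy stand-in for the Z80 side: an NMI handler that feeds sample bytes."""

    def __init__(self, sample_rom):
        self.sample_rom = sample_rom  # flat ROM here; the real hardware banks it
        self.position = 0
        self.chip = None

    def nmi(self):
        # Look up the next byte in the sample ROM and hand it to the chip.
        if self.position < len(self.sample_rom):
            self.chip.write_data(self.sample_rom[self.position])
            self.position += 1


def play(sample_rom, requests):
    """Wire the two halves together and let the chip issue `requests` data requests."""
    cpu = SoundCpu(sample_rom)
    chip = SlaveModeChip(on_drq=cpu.nmi)
    cpu.chip = chip
    for _ in range(requests):
        chip.request_byte()
    return chip.received
```

The point of the sketch is only the direction of control: the chip decides when data is needed, and the CPU reacts.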

The current code in MAME handles slave mode with a big hack. Essentially, when a sample is started, the code cranks up a timer that requests data at a fixed rate of 40,000 times per second. This is way faster than is needed for sample playback, so the accumulated data is kept in a buffer until it is needed.
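As a rough illustration of why that buffer fills up, here is a small Python sketch of a fixed-rate feed running ahead of playback. The function name is mine, and the playback consumption rate used in the usage note is a made-up figure, not a measured one; only the 40,000 requests per second comes from the existing code.

```python
from collections import deque


def backlog_after(request_hz, consume_hz, milliseconds):
    """Return how many bytes are still waiting in the buffer after `milliseconds`."""
    buffer = deque()
    requested = request_hz * milliseconds // 1000  # bytes pulled from the CPU
    consumed = consume_hz * milliseconds // 1000   # bytes playback actually used
    for i in range(requested):
        buffer.append(i & 0xFF)                    # placeholder sample data
    for _ in range(min(consumed, len(buffer))):
        buffer.popleft()
    return len(buffer)
```

With an assumed consumption rate of 8,000 bytes per second, `backlog_after(40000, 8000, 10)` leaves 320 bytes queued after just 10 milliseconds.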

Recently, Jarek Burczynski (who is responsible for a ton of very detailed work on many sound chips emulated in MAME) set about figuring out the correct timings of the /DRQ line in slave mode. This information essentially proved the theory that there is no internal buffer on the chip, and the /DRQ line is asserted "on demand" as data is needed. This is important for the System 16B emulation case because the NMI handlers in the various sound programs take varying amounts of time. On many games, the NMI handler is very short, and can easily respond to 40,000 requests per second. On other games (Cotton in particular), the NMI handler is long, and you quickly overrun the processor with interrupts.
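To put rough numbers on that, the cycle budget per request is just the CPU clock divided by the request rate. The sketch below is Python with hypothetical names, and the 4 MHz clock and handler cycle counts in the usage note are assumptions for illustration, not figures from the hardware.

```python
def nmi_cycle_budget(cpu_clock_hz, requests_per_second):
    """CPU cycles available between two consecutive data requests."""
    return cpu_clock_hz // requests_per_second


def handler_keeps_up(handler_cycles, cpu_clock_hz, requests_per_second):
    """True if an NMI handler of the given length finishes before the next request."""
    return handler_cycles <= nmi_cycle_budget(cpu_clock_hz, requests_per_second)
```

Assuming a 4 MHz sound CPU, 40,000 requests per second leaves 100 cycles per NMI. A hypothetical 60-cycle handler fits; a hypothetical 250-cycle handler does not, and the interrupts pile up.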

So, how do you implement this in MAME's architecture? Unfortunately in MAME, the sound emulation is a second-class citizen compared to the CPU emulation. Without going into excruciating detail, the upshot is that if a CPU needs to signal something to a sound chip (say, by writing data to it), then we stop executing the CPU and catch the sound chip up to the current time before actually sending the signal. This works very well to achieve highly accurate sound playback. The problem is that if the sound chip needs to send a signal back to a CPU (for example, changing the state of the /DRQ line), there is no mechanism for that to happen. The only way to work around this is to know in advance when the next signalling time will be, and schedule it using a timer.
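The "catch up, then signal" idea can be sketched in a few lines of Python. This is a simplified model, not MAME's actual sound API: the class, its methods, and the integer-microsecond timestamps are all assumptions of mine.

```python
class LazySoundChip:
    """Toy sound chip that only generates output when someone forces it to catch up."""

    def __init__(self, sample_rate):
        self.sample_rate = sample_rate
        self.samples_generated = 0
        self.level = 0     # current output level, changed by CPU writes
        self.output = []

    def update_to(self, now_us):
        # Generate every sample that should exist by time `now_us` (microseconds).
        target = now_us * self.sample_rate // 1_000_000
        while self.samples_generated < target:
            self.output.append(self.level)
            self.samples_generated += 1

    def write(self, now_us, value):
        # CPU -> chip: first bring the chip up to the present, then apply the write.
        self.update_to(now_us)
        self.level = value
```

Notice that nothing in this model lets the chip call back into the CPU at a time of its own choosing. It only ever runs when the CPU side (or the final mix) asks it to, which is exactly the gap that a timer has to fill.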

Thus on the one hand we have standalone mode, where everything is done internally and no communication back to the CPU is necessary. This case should run like a regular sound chip. And then we have the slave mode case, where we need to signal /DRQ very frequently.

The solution I came to was to rewrite the way the UPD7759 worked internally to be entirely driven by a state machine. This state machine is implemented in a common routine that is called in one of two ways. In standalone mode, the sound generation code simply counts how many UPD7759 clocks elapse for each sample it generates, and then, when it's time to transition to the next state, it calls the state machine routine to advance things along.
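Here is a minimal Python sketch of the standalone arrangement, where the sample generator counts clocks and calls the shared state machine itself. The states, the per-state clock counts, and every name below are hypothetical; the real chip has more states and different timings.

```python
IDLE, FETCH, PLAY = "idle", "fetch", "play"

# Hypothetical clock counts per state; the real chip's timings differ.
STATE_CLOCKS = {FETCH: 4, PLAY: 8}


class Chip:
    def __init__(self, data):
        self.data = list(data)  # stands in for sample data read from the ROM
        self.state = IDLE
        self.clocks_left = 0
        self.level = 0

    def start(self):
        self.state = FETCH
        self.clocks_left = STATE_CLOCKS[FETCH]


def advance_state(chip):
    """Shared state machine: move to the next state and set how long it lasts."""
    if chip.state == FETCH:
        if chip.data:
            chip.level = chip.data.pop(0)
            chip.state = PLAY
        else:
            chip.level = 0
            chip.state = IDLE
    elif chip.state == PLAY:
        chip.state = FETCH
    chip.clocks_left = STATE_CLOCKS.get(chip.state, 0)


def generate_standalone(chip, num_samples, clocks_per_sample):
    """Generate samples, counting chip clocks and advancing the state as needed."""
    out = []
    for _ in range(num_samples):
        out.append(chip.level)
        if chip.state == IDLE:
            continue
        chip.clocks_left -= clocks_per_sample
        while chip.state != IDLE and chip.clocks_left <= 0:
            overshoot = chip.clocks_left    # zero or negative
            advance_state(chip)
            chip.clocks_left += overshoot   # carry leftover clocks into the new state
    return out
```

Carrying the overshoot into the next state is my own choice to keep the toy timing from drifting; the key property is simply that timing and sound generation live in the same loop.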

In slave mode, however, things work quite differently. When a sample is started, a timer is set up to fire immediately. That timer calls the state machine routine to advance the state, after which it looks to see how long we need to stay in that particular state. Based on the number of clocks it needs to stay there, it schedules another timer to fire at that exact moment. In the meantime, the actual sound is generated the same way it is for the standalone case, except that the sound generator doesn't count clocks (since all the timing is being handled in the timers).
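The timer-driven variant can be sketched the same way. Again this is Python with made-up states, durations, and names rather than the actual MAME code, and a tiny heap-based queue stands in for the emulator's timer system. Time is measured in chip clocks.

```python
import heapq

# Hypothetical per-state durations in chip clocks (not the real UPD7759 timings).
STATE_CLOCKS = {"request": 4, "play": 8}


class SlaveChip:
    def __init__(self, feed_byte):
        self.feed_byte = feed_byte  # stands in for /DRQ -> NMI -> data write
        self.state = "idle"
        self.level = 0


def advance_state(chip):
    """Move to the next state; return how many clocks to stay there (0 means stop)."""
    if chip.state in ("idle", "play"):
        chip.state = "request"       # ask the CPU for the next byte
    elif chip.state == "request":
        value = chip.feed_byte()     # by now the CPU has answered the request
        if value is None:
            chip.state = "idle"      # no more data: playback ends
            return 0
        chip.level = value
        chip.state = "play"
    return STATE_CLOCKS[chip.state]


def run_slave(chip, max_events=100):
    """Drive the state machine purely from timers; return (time, state) per firing."""
    timers = [0]                     # starting a sample sets a timer to fire immediately
    log = []
    while timers and len(log) < max_events:
        now = heapq.heappop(timers)
        clocks = advance_state(chip)
        log.append((now, chip.state))
        if clocks:
            heapq.heappush(timers, now + clocks)  # fire again exactly when the state ends
    return log
```

The sound generator is deliberately absent here: in this mode it would just read `chip.level` whenever it is asked for samples, without counting clocks.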

In short, in standalone mode the timing and sound generation are handled by the same routine. In slave mode, the timing is handled by timers (which allows correctly timed communications with the sound CPU), and the sound generation is handled by the same routine as in standalone mode.

Fortunately, this all works quite well. Unfortunately, that was only part of the problem with 16B sound. More on that later.