
CPU Scheduling in MAME (part 2)

Part 1 of this article described the basic round robin scheduling algorithm used by MAME when running multiple CPUs. It can be summarized as follows (a rough code sketch appears after the list):


  1. Determine when the next timer is going to fire

  2. Loop over each CPU:

    1. Compute the difference between this time and the CPU's local time

    2. Execute for at least that long

    3. Count up the cycles executed and compute the CPU's local time


  3. Return to step 1
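
In code, one pass through that loop looks roughly like the sketch below. To be clear, this is not MAME's actual scheduler; the types and helpers (cpu_t, next_timer_expire, execute_for_cycles, process_timers) are made up for illustration, but the structure follows the steps above.

    #include <math.h>
    #include <stddef.h>

    /* Hypothetical stand-ins for illustration only; not MAME's real API */
    typedef struct
    {
        double local_time;   /* how far this CPU has executed, in seconds */
        double clock;        /* clock frequency in Hz */
    } cpu_t;

    double next_timer_expire(void);                     /* when does the next timer fire? */
    int    execute_for_cycles(cpu_t *cpu, int cycles);  /* run at least 'cycles', return actual count */
    void   process_timers(void);                        /* fire any timers that are now due */

    void round_robin_once(cpu_t *cpus, size_t numcpus)
    {
        /* step 1: determine when the next timer is going to fire */
        double target = next_timer_expire();

        /* step 2: loop over each CPU */
        for (size_t i = 0; i < numcpus; i++)
        {
            cpu_t *cpu = &cpus[i];

            /* 2.1: difference between that time and the CPU's local time */
            double delta = target - cpu->local_time;
            if (delta <= 0)
                continue;

            /* 2.2: execute for at least that long (round the cycle count up) */
            int requested = (int)ceil(delta * cpu->clock);
            int executed  = execute_for_cycles(cpu, requested);

            /* 2.3: count the cycles executed and update the CPU's local time */
            cpu->local_time += (double)executed / cpu->clock;
        }

        /* due timers fire here, then the caller goes back to step 1 */
        process_timers();
    }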

Now this is all fine and great when you can predict the future — notice that step 1 requires you to do just that. And many events in MAME are in fact periodic and predictable in nature, so this works out ok as a starting point.

One very common event that isn't predictable, however, is communication between two CPUs. In fact, this is one of the main reasons I added the timer system to MAME in the first place. Let's look at an example of what can go wrong with CPU communication when there is no synchronization.

Going back to our previous example, we have CPU #0 running at 14MHz, and CPU #1 running at 2MHz. We also know there is a timer that is going to fire in 150usecs, or at time 0.000150. So we begin executing CPU #0 for 2100 cycles. This time, however, the code executing on CPU #0 decides to signal an interrupt to CPU #1 at a point 1500 cycles into its execution (at time = 1500 / 14,000,000 = 0.000107143). With no synchronization, the interrupt is signalled (generally some state is set in CPU #1's context indicating that an interrupt is pending), and CPU #0 continues executing until the end of its timeslice. When it's done, it has executed 2112 cycles like before, and its local time is 0.000150857.

Now it's time to execute CPU #1 for the first time. Right away, however, it notices that it has an interrupt pending, so it may process the interrupt, and then continue executing for the rest of its timeslice normally.

So what's the problem? Well, CPU #0 signalled the interrupt at local time = 0.000107143, but CPU #1 acted on the interrupt at local time = 0 (right at the start of its timeslice). Let's say CPU #1 is a sound CPU; maybe the music started just a fraction of a second too early as a result. Worse, let's say CPU #1 was busy doing something else and wasn't ready to handle the interrupt just then. Maybe you've crashed the code or altered its state in a bad way.
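
The mismatch is easy to see with the numbers laid out. The little program below isn't MAME code at all; it just reproduces the figures from the scenario above.

    #include <stdio.h>

    int main(void)
    {
        double cpu0_clock = 14000000.0;   /* CPU #0 runs at 14MHz */

        /* CPU #0 signals the interrupt 1500 cycles into its 2100-cycle timeslice... */
        double signal_time = 1500 / cpu0_clock;

        /* ...but keeps running to the end of the slice, finishing at 2112 cycles */
        double cpu0_local_time = 2112 / cpu0_clock;

        /* CPU #1 hasn't run yet, so it acts on the interrupt at its local time of 0 */
        double cpu1_acts_at = 0.0;

        printf("interrupt signalled at  %.9f\n", signal_time);      /* 0.000107143 */
        printf("CPU #0 ends slice at    %.9f\n", cpu0_local_time);  /* 0.000150857 */
        printf("CPU #1 acts on it at    %.9f\n", cpu1_acts_at);     /* 0.000000000 */
        printf("so the interrupt lands  %.9f seconds too early\n",
               signal_time - cpu1_acts_at);
        return 0;
    }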

What might not be immediately obvious in looking at the timer system is that most of the timers that are created are set to fire instantly. If you've ever seen timer_set(TIME_NOW, ...) in MAME, you've seen a request for an instant timer. This is done because timers are effectively synchronization barriers. The whole scheduling algorithm is based upon executing all the CPUs until the time when the next timer is set to fire. Setting up an instant timer is essentially asking the scheduler and the timer system to work together to bring all the CPUs up to the current instant in time before calling the function you provide.
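
In practice, that means CPU #0's handler never touches CPU #1's state directly; it schedules an instant timer and does the work in the callback. Here is a sketch of the pattern. It assumes a simplified timer_set(delay, param, callback) form (the real signature has varied across MAME versions), and signal_irq_to_cpu1 is a hypothetical helper, not an actual MAME function.

    /* A sketch of the pattern, assuming a simplified timer API of the form
       timer_set(delay, param, callback); the real MAME signature has varied
       across versions, and signal_irq_to_cpu1() is a hypothetical helper. */

    #define TIME_NOW 0.0

    void timer_set(double delay, int param, void (*callback)(int param));
    void signal_irq_to_cpu1(int state);

    /* the callback runs only after the scheduler has brought every CPU up to
       the time at which the timer was set */
    static void irq_callback(int param)
    {
        signal_irq_to_cpu1(param);
    }

    /* what NOT to do from CPU #0's handler: poke CPU #1 directly
           signal_irq_to_cpu1(1);                                    */

    /* what to do instead: schedule an instant timer as a synchronization barrier */
    void cpu0_signals_cpu1(void)
    {
        timer_set(TIME_NOW, 1, irq_callback);
    }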

How does this help the interrupt signalling issue described above? Let's say that when CPU #0 decides to signal an interrupt to CPU #1, it doesn't signal it immediately. Rather, it sets an instant timer to go off. Since the current time = 0.000107143, that is when the timer is scheduled to fire. But timers don't fire until the end of the round robin sequence, and then only after all CPUs have reached that time. So even if we were at the end of the round robin, it still wouldn't be able to fire just yet because CPU #1 is still sitting back at time = 0.

We have another issue here as well. CPU #0 has just set a timer to go off at time = 0.000107143, but it also still has 600 cycles left to execute before the end of its timeslice. We could potentially allow it to complete its run, ending up at time = 0.000150857. But then when we executed CPU #1, we would only execute up until the time the timer was set to go off (at 0.000107143), and the two CPUs would be significantly out of sync. Rather than letting that happen, whenever a new timer is created during a timeslice, and it is scheduled to fire before that timeslice is up, the timer system and the scheduler work together to abort the execution of that CPU. Generally, this means that the CPU will stop executing at the end of the current instruction, and return control back to the scheduler.
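
That cooperation can be sketched like this, again with made-up names rather than MAME's real internals: when a new timer is created while a CPU is executing, its expiration time is checked against the end of the current timeslice, and if it lands inside the slice, the running CPU is asked to stop at the next instruction boundary.

    /* Hypothetical sketch; not MAME's real timer internals */

    typedef struct
    {
        double expire;            /* absolute time at which this timer fires */
        /* ...callback, parameter, list links... */
    } timer_entry;

    extern double current_slice_end;     /* when the active CPU's timeslice ends */
    extern int    cpu_is_executing;      /* nonzero while a CPU is inside its timeslice */

    void abort_current_timeslice(void);  /* stop the active CPU after the current instruction */

    void timer_created(timer_entry *t)
    {
        /* if the new timer fires before the running CPU's slice is over, cut the
           slice short so the scheduler regains control at t->expire */
        if (cpu_is_executing && t->expire < current_slice_end)
            abort_current_timeslice();
    }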

In this case, since CPU #0 set the timer during its timeslice, and since the timer is scheduled to fire during the timeslice, execution on CPU #0 is aborted with 600 cycles left to go. The scheduler knows that CPU #0 only executed 1500 of its requested 2100 cycles, and updates the local time of CPU #0 to be 1500 / 14,000,000 = 0.000107143.

Now CPU #1 gets its chance to execute. 0.000107143 * 2,000,000 comes out to a bit over 214, so we execute CPU #1 for at least 215 cycles. When it is finished, maybe it actually executed 217 cycles, so its local time is 0.0001085.
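
Once more, the arithmetic, covering both the aborted run of CPU #0 and the catch-up run of CPU #1. Again, this isn't MAME code; it just reproduces the numbers.

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* CPU #0 was aborted after 1500 of its 2100 requested cycles */
        double cpu0_local = 1500 / 14000000.0;
        printf("CPU #0 local time:   %.9f\n", cpu0_local);       /* 0.000107143 */

        /* CPU #1 has to catch up to that time; at 2MHz that is a bit over 214
           cycles, and since we always run at least that long, we round up */
        int cpu1_request = (int)ceil(cpu0_local * 2000000.0);
        printf("CPU #1 cycles asked: %d\n", cpu1_request);       /* 215 */

        /* it overshoots slightly, finishing after 217 cycles */
        double cpu1_local = 217 / 2000000.0;
        printf("CPU #1 local time:   %.9f\n", cpu1_local);       /* 0.000108500 */
        return 0;
    }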

The global time is updated to the minimum of the CPU times, in this case 0.000107143, and the timer system is asked to process timers. At this point, the callback for the instant timer we set is called, and in that callback is where we signal the interrupt to CPU #1.
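
That end-of-round bookkeeping can be sketched like this (hypothetical names again): the global time becomes the minimum of all the CPU local times, and only timers whose expiration time every CPU has reached, that is, at or before that minimum, are allowed to fire.

    #include <stddef.h>

    /* Hypothetical sketch of the end-of-round bookkeeping */

    typedef struct
    {
        double local_time;
    } cpu_t;

    typedef struct timer_entry
    {
        double expire;                  /* absolute time at which this timer fires */
        void (*callback)(int param);
        int    param;
        struct timer_entry *next;       /* list kept sorted by expiration time */
    } timer_entry;

    /* the global time is the minimum of all the CPU local times */
    double update_global_time(const cpu_t *cpus, size_t numcpus)
    {
        double global = cpus[0].local_time;
        for (size_t i = 1; i < numcpus; i++)
            if (cpus[i].local_time < global)
                global = cpus[i].local_time;
        return global;
    }

    /* fire every timer that all CPUs have caught up to; for the instant timer
       in the example, this is where the interrupt actually gets signalled to
       CPU #1, at (or very near) the correct local time */
    void process_timers(timer_entry **head, double global_time)
    {
        while (*head != NULL && (*head)->expire <= global_time)
        {
            timer_entry *t = *head;
            *head = t->next;
            t->callback(t->param);
        }
    }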

Returning to the scheduler, we look into the future and see that there is still a timer set to go off at time = 0.000150, so we compute cycles for CPU #0 (600) and execute. After CPU #0 is done, we switch to CPU #1 and start executing. CPU #1 now has an interrupt pending, but it has been signalled at the correct local time (or close to it, at 0.0001085), and synchronization between the two CPUs has been achieved.

What happens when CPU #1 tries to talk back to CPU #0? Find out in part 3, which will deal with this complex issue.