> In theory emulation should be flawless
Here are some of the restrictions on emulation in theory vs practice:
Different systems have specialized hardware, and instructions written for one often don't map cleanly onto another. As a result, emulation is *much* slower than a straight comparison of clock speeds would suggest. Depending on how similar the two architectures are, even basic operations can take more than 20x as long, instruction for instruction.
The easiest way to write an emulator is with a simple interpreter. It essentially runs like a person reading out of a cookbook: "Load this location", "Add these two things together". A table of what each instruction means tells the interpreter what to do in every case. However, this is extremely slow, since every instruction has to be read, looked up, AND executed every single time it runs, even inside a loop that repeats thousands of times.
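To make the cookbook idea concrete, here is a minimal sketch of a table-driven interpreter. The three-instruction machine, its opcode names, and the register names are all invented for illustration; they have nothing to do with the N64's real instruction set or with how Mupen64plus is written.

```python
# Toy interpreter for a made-up machine (hypothetical opcodes and registers).

def op_load(regs, dst, value):
    regs[dst] = value                  # "Load this value"

def op_add(regs, dst, src):
    regs[dst] = regs[dst] + regs[src]  # "Add these two things together"

# The "table of what each instruction means".
DISPATCH = {"LOAD": op_load, "ADD": op_add}

def interpret(program):
    regs = {"r0": 0, "r1": 0}
    pc = 0
    while pc < len(program):
        opcode, *operands = program[pc]    # read and decode, every single time
        if opcode == "HALT":
            break
        DISPATCH[opcode](regs, *operands)  # look up the handler, then execute
        pc += 1
    return regs

program = [("LOAD", "r0", 2), ("LOAD", "r1", 3), ("ADD", "r0", "r1"), ("HALT",)]
result = interpret(program)
print(result["r0"])
```

Note that the decode and table lookup inside the loop are repeated for every instruction on every pass, which is where the overhead comes from.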
The ideal emulator would translate an entire program, ahead of time, into code that runs natively on the new host. This is referred to as static recompilation. In practice, this type of emulation is near-impossible to automate, since a translator can't always tell code from data or know where a computed jump will land, and most programs would need to be recompiled by hand. However, the concept led to a close approximation: dynamic recompilation.
A dynamic recompiler, or dynarec, takes segments of the code as they're executed and compiles whole blocks of it into host machine code. It then caches these for later use. However, if something occurs to invalidate that code, it's purged and recompiled. This allows code that is heavily reused to perform extremely well, but can cause code that frequently modifies itself to become even slower than a simple interpreter. It's a tradeoff that drastically improves performance for predictable programs, while still allowing small amounts of unpredictable behavior.
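The compile, cache, and purge cycle can be sketched roughly like this. It's a toy under heavy assumptions: the `ToyDynarec` class, the `ADDI`/`END` instructions, and the method names are all hypothetical, and "compiling" here just means building a Python closure, whereas a real dynarec emits host machine code.

```python
# Toy block cache in the spirit of a dynarec (all names are hypothetical).

class ToyDynarec:
    def __init__(self, program):
        self.program = list(program)  # guest "memory" holding instructions
        self.cache = {}               # start address -> (end address, block)
        self.compile_count = 0

    def _compile(self, start):
        # Decode once, up to the next END marker.
        ops = []
        pc = start
        while self.program[pc][0] != "END":
            ops.append(self.program[pc])
            pc += 1
        self.compile_count += 1

        def block(regs):
            for _, dst, amount in ops:  # only ADDI exists in this toy machine
                regs[dst] += amount

        return pc + 1, block

    def run_block(self, start, regs):
        if start not in self.cache:     # cache miss: translate now
            self.cache[start] = self._compile(start)
        _, block = self.cache[start]
        block(regs)                     # cache hit: no decoding at all

    def write_code(self, addr, instruction):
        # The guest modified its own code: purge every cached block covering addr.
        self.program[addr] = instruction
        stale = [s for s, (end, _) in self.cache.items() if s <= addr < end]
        for s in stale:
            del self.cache[s]

emu = ToyDynarec([("ADDI", "r0", 1), ("ADDI", "r0", 2), ("END",)])
regs = {"r0": 0}
for _ in range(3):
    emu.run_block(0, regs)              # compiled once, reused twice
after_three_runs = regs["r0"]
compiles_before_write = emu.compile_count

emu.write_code(1, ("ADDI", "r0", 10))   # self-modifying code invalidates the block
emu.run_block(0, regs)                  # forces a recompile
```

If the guest rewrote its code before every run, each call would pay for a compile on top of the execution, which is how a dynarec can end up slower than the plain interpreter.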
Mupen64plus is not written to be a "fast" emulator, per se. It uses a dynarec and HLE graphics, but given the choice, a great deal of the architecture favors accuracy* over performance. The consequence is an emulator with far fewer glitches, but one that can be quite a bit slower. Compared to some of the much older N64 emulators, like 1964 or Corn, Mupen64plus is much slower. However, due to severe glitches, those emulators make many N64 games completely unplayable. They're essentially made from hacks stacked on hacks until a target game works.
Multithreading is basically impossible, as Dorian said, for several reasons:
* The N64 was a single-threaded platform.
* Rendering is already hardware-accelerated by the GPU, so there isn't much independent rendering work left to share between threads. The remaining heavy lifting is tightly intertwined with the dynarec, making it very difficult to separate the two.
* Making a single-threaded application multi-threaded is very complex, and often won't bring significant improvement in relatively linear applications. Emulation doesn't have a chance to build up a large buffer of events, since large parts of execution are reevaluated constantly. Even if such optimizations were put in, they would struggle to save more than the overhead they would produce.
Here's an analogy for the problem. A group of four people is reading a book. In order to reach the end fastest, they decide to split it into four sections. Four people means you'll get it done 4x as fast, in theory. However, in order for the second, third, and fourth person to understand what's happening, they have to go back and look, or wait for the first to explain to them. Unless the story has lots of independent splits, they spend most of their time looking back. In the end, they only read the book 1.1x faster, because so much waiting was involved. Plus, it took them a fair amount of time to get organized in the first place.
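The analogy is essentially Amdahl's law: the speedup from adding workers is capped by the part of the job that can't be split. Here's a rough sketch of the arithmetic. The 12% figure is just an assumption picked to match the 1.1x in the story, not a measurement of Mupen64plus, and the function name is my own.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup when only part of the work can be split among workers."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Assume only 12% of the book can be read independently by four readers.
speedup = amdahl_speedup(0.12, 4)
print(round(speedup, 2))
```

This doesn't even count the time spent getting organized, which only pushes the real number lower.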
Here are some optimizations that could *potentially* work, to a noticeable degree:
* RPi-optimized Graphics Plugin
* ARMv7-specific dynarec optimizations. (That is, assuming they're not already there. The core has some, but the RPi package people are using might be outdated. I don't keep up on such things.)
Practically speaking, though, you shouldn't expect particularly good performance on the RPi, especially at higher resolutions.
*Note: I am not trying to claim that Mupen64plus is a cycle-accurate emulator like BSNES or Cen64. I'm only asserting that it's more accurate than the other emulators listed, and slower as a result.
I hope this helps clear some things up.
William Shipley