Mirror of https://github.com/dolphin-emu/dolphin.git, synced 2025-12-16 04:09:39 +00:00
Instead of doing vector operations and throwing away the top 64 bits of each result, let's use scalar operations. On Cortex-A57 this saves us three cycles per vector operation changed to scalar, which works out to 3-9 cycles per emulated instruction. It also puts one fewer micro-op into the vector pipeline there. On the Nvidia Denver I couldn't see any noticeable performance difference, but it's a quirky architecture, so it may be noticing that we are throwing away the top bits anyway and optimizing for it. The world may never know what's truly happening there.
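
To illustrate why the change described in the commit message is safe, here is a minimal portable C++ sketch. It is not Dolphin's JIT code: `Vec2d`, `vector_add`, `add_via_vector`, and `add_via_scalar` are hypothetical names, and a `std::array` of two doubles stands in for a 128-bit register. It only shows that a vector add whose upper lane is discarded yields the same low-lane value as a scalar add; it does not model the cycle counts mentioned above.

```cpp
#include <array>

// Hypothetical stand-in for a 128-bit register holding two 64-bit doubles.
using Vec2d = std::array<double, 2>;

// Vector-style add: computes both lanes, as a two-lane SIMD add would.
Vec2d vector_add(const Vec2d& a, const Vec2d& b) {
    return {a[0] + b[0], a[1] + b[1]};
}

// Old approach (assumed): do the full vector add, then keep only lane 0.
double add_via_vector(const Vec2d& a, const Vec2d& b) {
    return vector_add(a, b)[0];  // the upper lane is computed and thrown away
}

// New approach (assumed): operate on lane 0 only, as a scalar add would.
double add_via_scalar(const Vec2d& a, const Vec2d& b) {
    return a[0] + b[0];
}
```

Both helpers return the same value for any inputs, since the upper lane never feeds into lane 0. That is the property that lets an emitter swap a vector instruction for its scalar form when only the low 64 bits are consumed.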

| Name | | |
|---|---|---|
| AudioCommon | ||
| Common | ||
| Core | ||
| DiscIO | ||
| DolphinQt | ||
| DolphinWX | ||
| InputCommon | ||
| UICommon | ||
| VideoBackends | ||
| VideoCommon | ||
| CMakeLists.txt | ||