Implementing a NES Emulator in Rust

kouteiheika · on March 19, 2019

Shameless plug - for anyone that's interested, I also wrote a NES emulator in Rust a few years back: https://github.com/koute/pinky

I even compiled it to WebAssembly: https://koute.github.io/pinky-web/

> The CPU address space has several PPU registers mapped. So the CPU maintains a permanent mutable reference to the PPU. But the top-level Nes object also owns the PPU. I worked around this using Box to assign fixed memory addresses to values, and then “unsafe” pointer dereferences when needed.

There's actually a neat little trick I used in my emulator that satisfies Rust's single-ownership rule while keeping the emulator's subcomponents independent. It makes it possible to interconnect them without any unsafe code, and lets the compiler optimize the whole thing because everything is statically dispatched.

Basically (simplifying a little bit), for each component (CPU, PPU, APU, etc.):

1. You put the whole state of the component into a separate `State` structure.

2. You create a `Context` trait which has a `get_state`/`get_state_mut` method as well as various callbacks needed by the component itself (e.g. PPU's `Context` has a `peek_video_memory` so that it can read video memory which is stored externally).

3. You create an `Interface` trait (with `Context` as supertrait) which contains the public interface of a given component (e.g. PPU has `execute`, `peek_ppustatus`, etc.).

4. You put the whole implementation of the component inside of a `Private` trait (with `Context` as supertrait).

5. You do a blanket impl of `Interface` and `Private` for any `T` which implements `Context`.

So then you create a single `Nes` structure which has `cpu::State`, `ppu::State`, etc. inside of it as fields, and implements the `cpu::Context`, `ppu::Context`, etc. traits. Inside of those impls you just wire the various subcomponents together, and voila, every component is encapsulated from each other and yet they can talk with each other, and there is no indirection anywhere (no `Box`es, no `RefCell`s, etc.).
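The five steps above can be sketched roughly as follows. This is a heavily reduced illustration and not code taken from pinky: the fields of `State`, the body of `step`, the `dot` accessor and the `run_ppu` helper are all made up for the example, and only a PPU is wired in.

```rust
mod ppu {
    /// Step 1: all of the component's own state lives in one plain struct.
    /// (The fields are invented for this sketch.)
    #[derive(Default)]
    pub struct State {
        pub dot: u32,
        pub last_read: u8,
    }

    /// Step 2: state accessors plus the callbacks the component needs from
    /// the outside world.
    pub trait Context: Sized {
        fn get_state(&self) -> &State;
        fn get_state_mut(&mut self) -> &mut State;
        fn peek_video_memory(&self, address: u16) -> u8;
    }

    /// Step 3: the public interface, with `Context` as supertrait.
    pub trait Interface: Context {
        fn execute(&mut self) {
            Private::step(self)
        }
        fn dot(&self) -> u32 {
            self.get_state().dot
        }
    }

    /// Step 4: the implementation lives in a trait that is not exported.
    trait Private: Context {
        fn step(&mut self) {
            // Placeholder behaviour: read one byte of video memory per dot.
            let address = (self.get_state().dot % 0x4000) as u16;
            let value = self.peek_video_memory(address);
            let state = self.get_state_mut();
            state.last_read = value;
            state.dot += 1;
        }
    }

    /// Step 5: blanket impls for anything that implements `Context`.
    impl<T: Context> Interface for T {}
    impl<T: Context> Private for T {}
}

/// The top-level struct owns every component's state as a plain field.
pub struct Nes {
    ppu: ppu::State,
    video_memory: Vec<u8>,
}

/// The wiring between components happens in the `Context` impls.
impl ppu::Context for Nes {
    fn get_state(&self) -> &ppu::State {
        &self.ppu
    }
    fn get_state_mut(&mut self) -> &mut ppu::State {
        &mut self.ppu
    }
    fn peek_video_memory(&self, address: u16) -> u8 {
        self.video_memory[address as usize]
    }
}

/// Hypothetical driver: runs the PPU for `dots` steps and reports its state.
pub fn run_ppu(dots: u32) -> (u32, u8) {
    use ppu::Interface;
    let mut nes = Nes {
        ppu: ppu::State::default(),
        video_memory: vec![0; 0x4000],
    };
    nes.video_memory[2] = 0xAB;
    for _ in 0..dots {
        nes.execute();
    }
    (nes.dot(), nes.ppu.last_read)
}
```

Since `Nes` is a concrete type, every call here resolves statically, and the PPU code never sees anything of `Nes` beyond what its own `Context` exposes.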

hathawsh · on March 19, 2019

I'm really enjoying your 6502 emulation code. It feels clean and easy to read.

https://github.com/koute/pinky/blob/master/mos6502/src/virtu...

(I imagine the rest is good too, but 6502 is what I know. :-) )

TheServer201 · on March 19, 2019

In the Iterators section if you use

  for (idx, x) in xs.iter().enumerate()

instead of

  for (x, idx) in xs.iter().zip(0..xs.len())

you get the 4x unroll (https://godbolt.org/z/XDyx0w).
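For anyone who wants to try this locally, here is a minimal pair of functions to compare. The loop body (a wrapping weighted sum) is made up for illustration and is not the code from the article; whether and how far each version gets unrolled depends on the compiler version and flags, so it's worth checking the assembly yourself.

```rust
/// Weighted sum using `enumerate`, the form suggested above.
pub fn weighted_sum_enumerate(xs: &[u32]) -> u32 {
    let mut total = 0u32;
    for (idx, x) in xs.iter().enumerate() {
        total = total.wrapping_add(x.wrapping_mul(idx as u32));
    }
    total
}

/// The same loop written with `zip` over an index range.
pub fn weighted_sum_zip(xs: &[u32]) -> u32 {
    let mut total = 0u32;
    for (x, idx) in xs.iter().zip(0..xs.len()) {
        total = total.wrapping_add(x.wrapping_mul(idx as u32));
    }
    total
}
```

Both functions compute the same value, so any difference is purely in the generated code.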

devit · on March 19, 2019

Why not use serde instead of rolling your own "Savable"?

Also you should use Rc/Arc and Weak (along with RefCell/Mutex if needed) instead of the unsafe pointers. Even better, if possible, move the functions that access multiple components to the object that holds them all, and don't use any "smart pointers". Another possible design is to pass borrowed references explicitly to the methods, and add them to the trait signatures where needed.

Regarding memory access, it's probably fastest to have an hardcoded fast path for RAM and ROM, and then process the rest like now. Lookup tables might help as well (as long as all the data including them fits in cache, of course).
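A rough sketch of two of these suggestions combined: the top-level object passes a borrowed reference to the bus into the CPU's methods, and the bus has a hardcoded fast path for RAM and ROM. All type and field names are hypothetical, the "CPU" only pretends to execute instructions, and the ROM mapping assumes the simplest cartridge layout with no bank switching.

```rust
/// Hypothetical memory bus; names and layout are illustrative only.
pub struct Bus {
    ram: [u8; 0x800], // 2 KiB of internal RAM, mirrored up to 0x1FFF
    prg_rom: Vec<u8>, // cartridge ROM, assumed mapped at 0x8000..=0xFFFF
    pub slow_reads: u32,
}

impl Bus {
    pub fn new(prg_rom: Vec<u8>) -> Self {
        assert!(!prg_rom.is_empty(), "PRG ROM must not be empty");
        Bus { ram: [0; 0x800], prg_rom, slow_reads: 0 }
    }

    #[inline]
    pub fn read(&mut self, address: u16) -> u8 {
        match address {
            // Fast path 1: internal RAM and its mirrors.
            0x0000..=0x1FFF => self.ram[(address & 0x07FF) as usize],
            // Fast path 2: ROM, mirrored if it is smaller than 32 KiB.
            0x8000..=0xFFFF => {
                let offset = (address - 0x8000) as usize % self.prg_rom.len();
                self.prg_rom[offset]
            }
            // Everything else (registers, expansion area) goes the slow way.
            _ => self.read_slow(address),
        }
    }

    #[inline]
    pub fn write(&mut self, address: u16, value: u8) {
        if address < 0x2000 {
            self.ram[(address & 0x07FF) as usize] = value;
        }
        // Register and mapper writes would be dispatched here.
    }

    #[cold]
    fn read_slow(&mut self, _address: u16) -> u8 {
        // Placeholder for PPU/APU/controller register reads.
        self.slow_reads += 1;
        0
    }
}

pub struct Cpu {
    pub pc: u16,
    pub a: u8,
}

impl Cpu {
    /// The bus is borrowed only for the duration of the call, so the CPU
    /// never needs to hold a long-lived reference to it.
    pub fn step(&mut self, bus: &mut Bus) {
        // Pretend every opcode is "load the next byte into A".
        self.a = bus.read(self.pc);
        self.pc = self.pc.wrapping_add(1);
    }
}

/// The object that owns everything hands out the borrows.
pub struct Nes {
    pub cpu: Cpu,
    pub bus: Bus,
}

impl Nes {
    pub fn step(&mut self) {
        self.cpu.step(&mut self.bus);
    }
}
```

Because `cpu` and `bus` are separate fields, the borrow checker accepts `self.cpu.step(&mut self.bus)` without any smart pointers.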

FullyFunctional · on March 20, 2019

This is very interesting. Do you have a good example to study how this should be done?

toprerules · on March 19, 2019

So what are your general thoughts on Rust? Would you use it again? How does it compare to using C or C++? Did you enjoy working in it? What was disappointing/motivating about the language?

maeln · on March 19, 2019

If you like the subject of emulators and Rust, I suggest you take a look at Ferris' YouTube channel: https://www.youtube.com/channel/UC4mpLlHn0FOekNg05yCnkzQ/vid...

He streams his development of a Nintendo 64 and Virtual Boy emulator in Rust.

codetrotter · on March 19, 2019

And here is a video [1] and free to read online book [2] that someone else made about implementing a Game Boy emulator in Rust.

[1]: RustFest Rome 2018 - Ryan Levick: Oh Boy! Creating a Game Boy Emulator in Rust - https://youtu.be/B7seNuQncvU

[2]: DMG-01: How to Emulate a Game Boy - http://blog.ryanlevick.com/DMG-01/

jolson88 · on March 19, 2019

This is amazing. I've long wanted to get into some of this stuff and lately have fallen in love with Rust. I can't wait to watch these videos :D.

powerbook5300CS · on March 19, 2019

If I recall correctly, P.C. Walton, one of the creators of rust, used an NES emulator during rust’s initial programming to measure performance of the language’s output binaries.

eloisius · on March 20, 2019

Indeed. I peeped at his code[1] frequently while I worked on an original Game Boy emulator[2] that I've as of yet failed to finish.

[1]: https://github.com/pcwalton/sprocketnes

[2]: https://github.com/zacstewart/gbrs

sidcool · on March 19, 2019

Question, what made it easier to build this in Rust than other languages? I know the advantages of Rust, but what exactly was the experience in this context?

joshbaptiste · on March 19, 2019

Implementing an NES emulator is a great way to learn more about a language and the internals of the NES (which is well documented). It's sorta like a rite of passage and has been done in most of the "newer" languages like Nim (https://github.com/def-/nimes) , Go (https://github.com/fogleman/nes)

ofrzeta · on March 19, 2019

It's actually "rite of passage" as in latin "ritus".

joshbaptiste · on March 19, 2019

oops fixed.. thx

primo44 · on March 19, 2019

Almost: rite _of_ passage (not "to")

joshbaptiste · on March 19, 2019

argh.. fixed again thx

jerf · on March 19, 2019

You have now completed this write of passage.

oconnor663 · on March 19, 2019

To become a passage wright.

nukeop · on March 19, 2019

Well-documented, but many sources are contradictory. For such well researched hardware, there is a lot to be confused about when researching this topic by yourself.

Kye · on March 19, 2019

At least there's no shortage of implementations to reference.

bfrydl · on March 19, 2019

The author doesn't really suggest that Rust was chosen for any particular reason. A few benefits are mentioned but they aren't really specific to emulators.

ChrisRR · on March 19, 2019

Because why not?

hu3 · on March 19, 2019

From the article:

"C is probably still a better choice for those cases: It’s more difficult to find someone who can implement a Rust compiler than a C compiler.

But Rust seems usable in any project where C++ is a viable language."

xfer · on March 19, 2019

It's probably because implementing an emulator is highly rewarding and easy enough, which makes learning a new language more exciting, and the NES is extremely well-documented.

FartyMcFarter · on March 19, 2019

It seems that unsafe references were used, so the advantages of Rust appear to have been somewhat relinquished in this case.

bionicbits · on March 19, 2019

I am not sure you can avoid `unsafe` in an embedded type of project. `unsafe` isn't necessarily bad, and when used it at least makes it very clear where extra attention should be paid.

nukeop · on March 19, 2019

I also wrote a NES emulator in Rust, and managed to do it without using a single unsafe block. I used SDL bindings for graphics, if that counts, but it can also run in headless mode (for testing) without SDL perfectly well.

mcluck · on March 19, 2019

Do you have a link? I'd like to see how yours compares with OPs

nukeop · on March 19, 2019

Not finished, but for what it's worth: https://github.com/nukeop/mr-cool-nes

SPD-13 · on March 19, 2019

Is this named after Mr Cool Ice?

nukeop · on March 19, 2019

you know it

simias · on March 19, 2019

Writing an emulator is not really embedded programming (the target often qualifies as an embedded environment but the host can be any random desktop computer).

It's definitely possible to write an emulator without unsafe code. Here's a toy gameboy emulator I wrote a while ago in Rust without any unsafe code: https://github.com/simias/gb-rs

Here's a very incomplete PSX emulator with only a single unsafe (which might not even be necessary anymore, I haven't updated the code in a while): https://github.com/simias/rustation

mzs · on March 19, 2019

Are you sure about that?

https://github.com/simias/gb-rs/blob/master/Cargo.toml#L29

simias · on March 20, 2019

The SDL2 dependency? I don't understand what you mean by that.

Or are you saying that a dependency that I use happens to use unsafe code? In which case it's true but then by that definition it's effectively almost impossible to write 100% safe rust since the stdlib itself contains a non-negligible amount of unsafe code.

steveklabnik · on March 20, 2019

It’s always impossible, as you have to call into the OS at some point to do anything of use.

hopler · on March 19, 2019

Emulating a system that runs at around 2 MHz on a 4x3 GHz chip ought to be doable using managed memory.

flohofwoe · on March 19, 2019

Nitpicking a bit, but there's not a lot of useful things that could be multithreaded in an 8-bit emulator, so it would be more like 1x3Ghz ;)

In those old 8- and 16-bit machines all components usually ran completely synchronous for each clock tick, for instance if the video emulation runs out of sync with the CPU emulation for a tick or two, a write from the CPU to a video hardware register might not make it in time and you get garbage video output.

Modern asynchronous systems may need less (relative) host system performance for emulation, because the timing requirements between the different hardware components are much more relaxed.

jrockway · on March 19, 2019

The original hardware was multithreaded... in the sense that there were several separate processing units.

flohofwoe · on March 19, 2019

Yes, but as far as I can glance from the NES spec, those still run from the same clock (with different dividers, but they're still locked to each other).

I haven't written an emulator for the NES yet, but on machines like the Amstrad CPC or C64, the CPU can reprogram video hardware registers (like color palette entries) at any time, for instance in the middle of a scanline. When this is off by a tick, you get the color palette change a couple of pixels late or early.

If CPU and video chip emulation ran on different threads, they would need to sync with each other a few million times per second, which doesn't sound like a good idea. If synchronization is only needed once per scanline, or even once per frame, that's an option of course. Often this is good enough to run some games which didn't go too close to the metal (again this is only from my experience with home computer emulators, haven't done NES stuff yet).

Parallelizing through SIMD might be an option though (one can pack a lot of 4- or 8-bit counters into a 512-bit register).

boomlinde · on March 20, 2019

Yes, but their emulation needs to be performed in lockstep, normally per clock tick or at least externally visible state change, to accurately model their interaction with respect to timing.

Threading in that case would likely mean several orders of magnitude of performance degradation.
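To make the lockstep point concrete, here is a toy single-threaded loop. It assumes the NTSC ratio of 3 PPU dots per CPU cycle; the "program" (a palette write on CPU cycle 2) and all names are invented for the example.

```rust
/// NTSC NES ratio: the PPU advances 3 dots for every CPU cycle.
pub const PPU_DOTS_PER_CPU_CYCLE: u64 = 3;

pub struct Machine {
    pub cpu_cycles: u64,
    pub palette: u8,
    pub pixels: Vec<u8>,
}

impl Machine {
    pub fn new() -> Self {
        Machine { cpu_cycles: 0, palette: 0, pixels: Vec::new() }
    }

    /// Hypothetical "program": on CPU cycle 2 it rewrites a video register.
    fn cpu_tick(&mut self) {
        if self.cpu_cycles == 2 {
            self.palette = 1;
        }
        self.cpu_cycles += 1;
    }

    /// The PPU emits one pixel per dot using whatever the register holds now.
    fn ppu_tick(&mut self) {
        self.pixels.push(self.palette);
    }

    /// One lockstep iteration: a CPU cycle, then the PPU dots that belong to it.
    pub fn step(&mut self) {
        self.cpu_tick();
        for _ in 0..PPU_DOTS_PER_CPU_CYCLE {
            self.ppu_tick();
        }
    }
}

/// Runs the toy machine for the given number of CPU cycles.
pub fn run(cpu_cycles: u64) -> Vec<u8> {
    let mut machine = Machine::new();
    for _ in 0..cpu_cycles {
        machine.step();
    }
    machine.pixels
}
```

The register write lands on an exact dot boundary (pixel 6 here). If the two components ran on separate threads, keeping that boundary exact would need a synchronization point on every iteration of this loop.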

TheCoelacanth · on March 19, 2019

Except that an explicit goal of the project was to run simulations much faster than real-time to train an AI to play the game, so they still wanted to squeeze the maximum performance out of the program.

bluejekyll · on March 19, 2019

unsafe exists in Rust for when you need it. When it is used, it means nothing more than you’re writing code for which you must ensure the safety instead of the compiler.

The times when unsafe is poorly used are when it’s being abused to ignore safety related compilation issues, like lifetimes and/or mutability. Though, even in those cases it might be ok if you’re using lower level concurrency primitives to enforce those guarantees.

unsafe does not remove all advantages of Rust, it only removes a few restrictions. It should be avoided unless necessary.

steveklabnik · on March 19, 2019

It doesn’t remove restrictions, it adds unchecked constructs. Very different!

bluejekyll · on March 19, 2019

Hm, I’m not sure I see the difference between those concepts, but I do like your framing better.

lmkg · on March 19, 2019

The main point is that if you take safe code and put it inside an unsafe block, the meaning of the code doesn't change. So for example, array indexing is bounds-checked in safe code, and non-bounds-checked array access is only available in unsafe blocks. This does not mean that the call `a[i]` loses bounds-checking in an unsafe block. It has the same semantics regardless of context. Rather, the separate call `a.get_unchecked(i)` is available inside an unsafe block and cannot be called in safe code.
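A small sketch of that distinction, using only standard slice methods. The function names are made up, and `#[allow(unused_unsafe)]` is there because the compiler would otherwise warn that the block is unnecessary.

```rust
/// Indexing inside an `unsafe` block is still bounds-checked; the block
/// changes nothing about `a[i]`.
#[allow(unused_unsafe)]
pub fn index_in_unsafe_block(a: &[u8], i: usize) -> u8 {
    unsafe { a[i] }
}

/// Safe alternative that reports an out-of-range index as `None`.
pub fn index_with_get(a: &[u8], i: usize) -> Option<u8> {
    a.get(i).copied()
}

/// The unchecked access is a separate method that can only be called from
/// unsafe code.
///
/// # Safety
/// The caller must guarantee `i < a.len()`.
pub unsafe fn index_unchecked(a: &[u8], i: usize) -> u8 {
    unsafe { *a.get_unchecked(i) }
}
```

Calling `index_in_unsafe_block` with an out-of-range index panics exactly like plain indexing would, whereas doing the same with `index_unchecked` is undefined behaviour.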

steveklabnik · on March 19, 2019

What my sibling said, but also, that unsafe doesn’t mean you’re free from rules; where safe and unsafe interact, your unsafe code is expected to uphold the safety guarantees that the safe code expects.

This matters because of stuff like the article; if you see the other thread, I’m pretty sure (though haven’t verified yet) that this code has UB, even though they’re using unsafe.

bluejekyll · on March 19, 2019

Right, which is why I phrased it as a “few restrictions”, I do understand the point you’re making.

I suppose the reason I consider it restrictions is due to things like “in safe code you can not work with raw pointers”. That seems like a restriction to me, but I can see why saying that way is less accurate.

oconnor663 · on March 19, 2019

If you use unsafe Rust, you do give up the advantage of being able to say, "Look at this code, no unsafe, therefore no UB." However, the lifetime and ownership model is still doing a lot for you. You get a lot of control over the way that your safe code (hopefully most of your program) can interact with your unsafe code. For example, if I have this function (partially copied from libstd):

    fn reverse_slice<T>(slice: &mut [T]) {
        let len = slice.len();
        for i in 0..len / 2 {
            // Unsafe swap to avoid the bounds check in safe swap.
            unsafe {
                let pa: *mut T = slice.get_unchecked_mut(i);
                let pb: *mut T = slice.get_unchecked_mut(len - 1 - i);
                std::ptr::swap_nonoverlapping(pa, pb, 1);
            }
        }
    }

In theory, I could use those unsafe pointers to alias the same value and cause UB. Or I could stash them somewhere that outlives the lifetime of what they're pointing to, and cause UB later on that way. So I have to audit the function carefully to avoid doing those things. But at the same time (in the absence of other broken unsafe code in the caller), this function can rely on a lot of guarantees:

- The `slice` reference is guaranteed to be unique. No other code, on any thread, has read or write access to `slice` while this function is running.

- The length of `slice` is guaranteed to be correct. All of the memory it refers to has been properly initialized, and computing pointer offsets into it cannot overflow `usize` or otherwise cause UB. (This guarantee is surprisingly subtle, because it means the maximum length of a slice is `isize::MAX` rather than `usize::MAX`.)

- The type `T` is guaranteed to be safe to move. It doesn't contain any internal references to its own memory, which moving would invalidate.

All of this put together makes it possible to audit this function by itself, to make sure safe code can't use it to trigger UB.

wyldfire · on March 19, 2019

To declare the advantages relinquished requires some kind of discussion of the `unsafe` blocks' scope. How many, how much code?

Regardless, I'm not even sure I'd say that memory safety is Rust's biggest advantage. It's certainly the headliner and it's critically important. But it's a really appealing package overall IMO.

Retra · on March 19, 2019

One of Rust's advantages is that you can drop into unsafe to do this sort of thing.

AntiRush · on March 19, 2019

There's also sprocketnes, which was originally written ~6 years ago by pcwalton, who works on rust/servo at Mozilla.

It's been kept up to date and it's interesting to see how the project evolved along with the language:

https://github.com/pcwalton/sprocketnes

prudhvis · on March 19, 2019

While the project itself is a great resource, the dependencies marked as "*" are unfortunate for a project that is 6 years in the making.

Bulbasaur2015 · on March 19, 2019

What was the most challenging part for you in building this emulator?

MichaelBurge · on March 19, 2019

It took 14 days to get it functionally-complete, where I could beat Mario using savestates.

The PPU took 7 days, the CPU took 5 days, the APU took 1 day, and there's a spare day for everything else.

The CPU was easy to make: Its natural unit of output is a single CPU instruction. Even if there are many details, it's a straightforward process to compare an instruction-by-instruction log with a reference log until there are no differences.

The PPU was harder because the natural unit of output is an entire frame: It's not any easier to get the first pixel correct, than to get them all correct. So almost everything needs to be in place before you can validate its output.
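The log-comparison workflow for the CPU can be as simple as finding the first line where the two traces differ. A minimal sketch follows; the function name is hypothetical, and the trace format used when trying it out is up to you, since the comparison is purely textual.

```rust
/// Returns the 1-based line number and both lines at the first point where
/// the emulator's trace and the reference trace disagree, or `None` if they
/// match. A missing line (one log is shorter) shows up as `None` on that side.
pub fn first_divergence<'a>(
    ours: &'a str,
    reference: &'a str,
) -> Option<(usize, Option<&'a str>, Option<&'a str>)> {
    let mut a = ours.lines();
    let mut b = reference.lines();
    let mut line_no = 1;
    loop {
        match (a.next(), b.next()) {
            (None, None) => return None,
            (x, y) if x == y => line_no += 1,
            (x, y) => return Some((line_no, x, y)),
        }
    }
}
```

The instruction at the reported line is the first one whose result differs, so the bug is usually in that opcode or in the one just before it.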

wiremine · on March 19, 2019

That's impressive. A few follow up questions:

1. Was this your first NES emulator?

2. Your first project in Rust?

MichaelBurge · on March 19, 2019

Yes to both of these. I usually use C for projects like these, and wanted to see how Rust compared.

codetrotter · on March 19, 2019

Follow up to that: Did you write any other emulators in the past?

cabaalis · on March 19, 2019

I want to build one myself as well, but I don't want to use existing code as a guide, rather challenge myself to implement from raw specs.

The problem is I haven't really found any yet that helped me (admittedly, I have so far devoted just one Saturday morning to it.) Are there any specs about the hardware that you used, or that anyone else can share?

kouteiheika · on March 19, 2019

> Are there any specs about the hardware that you used, or that anyone else can share?

Everything you need is either on http://wiki.nesdev.com or on their forums. I know because that's what I did with my NES emulator - I explicitly didn't want to look at any source code and instead wanted to implement everything only based on the docs.

Granted, what's on the NESdev isn't always easy to grok, up to the point of being really confusing sometimes. What I've found really helped is gradually setting up a test suite based on various test ROMs I could find, which even allows you to implement some parts of the emulator TDD-style once you get the basics up and running.

zeta0134 · on March 19, 2019

Shoutout to TASVideos here, who maintain a fairly up to date list of test ROMs for various hardware features. Once you have the emulator basics down (working well enough, say, to run Super Mario Bros, which is surprisingly demanding) you can throw blargg's tests at your emulator and start chasing the accuracy rabbit down the most marvelous of rabbit holes.

http://tasvideos.org/EmulatorResources/NESAccuracyTests.html

robert-boehnke · on March 19, 2019

Nice, I wrote an NES emulator in Swift once: https://github.com/robb/NES

souprock · on March 19, 2019

This is the second emulator I've seen in Rust, the other being an ARM emulator done as a personal project by somebody who was an intern here. I do emulators, but in C. (BTW, hiring)

For an expert performance-sensitive C programmer who pulls out all the platform-specific tricks to make stuff go fast, how do you think rust would be? The default safety is appealing, but I have lots of concerns.

Bitfield access looks painful. In C, I can set things up to make read/write named access to bitfields easy. Granted, the header to set this up is complicated: make a union of anonymous structs, each with a bitfield and the needed padding. With that though, I can use bitfields just like ordinary struct members. I can pluck fields out of opcodes for emulation, or I can fill them in for a JIT.

Sometimes in C, one might use gcc's computed goto extension. (getting a void pointer from a unary && operator applied to a label, then derefing it at the goto) It doesn't seem like rust has this, even in unsafe code. Actually, there isn't even a "goto" keyword... which I find to be a worrisome sign that might indicate stubbornly academic language design.

The bounds checking on arrays is kind of the whole point of rust, but if that can't get out of the way for speed then I'd have to make everything unsafe. If I do that, then rust is pointless. Has anybody checked the assembly to see if the compiler is good at eliminating the checks? For example, if I use a 5-bit bitfield to index into a 32-entry table, do the checks get optimized out?

What if I want aliasing? With gcc I can mark things __may_alias__, and with Visual Studio I don't even need that. I had a case where I needed to lay structs over each other like shingles, in groups of 4 with internal padding, so that the same struct member of each of the 4 structs would be adjacent in memory. This was needed so that vector intrinsics could be used.

Speaking of that, are there vector intrinsics? What if I specifically want MMX opcodes in one place (for MMX emulation) and SSE opcodes in another place?

Can I get a switch without a default, such that the compiler doesn't try to generate code for a case that isn't listed? I know this will seem like a horrible idea to many programmers, but sometimes performance matters. When the language can't keep up, I have to drop down into assembly, and that sucks more.

steveklabnik · on March 19, 2019

Rust does not have goto, but it’s not about being academic: a lot of Rust’s compile time checks are flow-based, and unstructured control flow would make them a lot more slow and complex.

Bitfields have a package that makes them easy.

If you want aliasing, there’s UnsafeCell.

Yes, there are vector intrinsics, and they’re part of the language, not a non-standard extension. And there are tools to choose between things too.

Switch requires a default, but you can declare the default unreachable. You should always be able to get the same asm.
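A sketch of a few of these points, under stated assumptions: the 8-bit opcode layout is invented, the bitfield access is done with manual shifts and masks instead of any particular crate, and whether the bounds check is actually removed is something to confirm in the generated assembly.

```rust
use std::hint::unreachable_unchecked;

/// Hypothetical opcode layout, invented for this sketch:
/// bits 7..5 = operation, bits 4..0 = register index.
#[derive(Clone, Copy)]
pub struct Opcode(pub u8);

impl Opcode {
    #[inline]
    pub fn operation(self) -> u8 {
        self.0 >> 5
    }

    #[inline]
    pub fn register(self) -> usize {
        (self.0 & 0x1F) as usize
    }

    /// Filling a field in, as one would when emitting code for a JIT.
    #[inline]
    pub fn with_register(self, reg: u8) -> Opcode {
        Opcode((self.0 & 0xE0) | (reg & 0x1F))
    }
}

/// A 5-bit index into a 32-entry table: the index is at most 31 and the
/// array length is known at compile time, so the optimizer can usually
/// prove the bounds check redundant.
pub fn read_register(regs: &[u32; 32], op: Opcode) -> u32 {
    regs[op.register()]
}

/// A `match` whose default arm is declared unreachable.
pub fn operand_bytes(op: Opcode) -> u8 {
    match op.operation() >> 1 {
        0 => 0,
        1 => 1,
        2 => 1,
        3 => 2,
        // SAFETY: `operation()` is a `u8` shifted right by 5, so it is in
        // 0..=7, and shifting right once more leaves only 0..=3.
        _ => unsafe { unreachable_unchecked() },
    }
}
```

With `unreachable_unchecked`, reaching the default arm is undefined behaviour, so the safety comment has to be true. The safe `unreachable!()` macro keeps a panic in place of that guarantee.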

sanxiyn · on March 20, 2019

Rust has no strict aliasing, so all Rust types are may_alias.

souprock · on March 20, 2019

I thought it was the other way, all being effectively restrict, with the hazard removed by the borrow checker.

In any case, I want to be able to have both behaviors.

RealityVoid · on March 19, 2019

Slightly off-topic: I was at Hackaday Belgrade last year, and a very good friend of mine ported an NES emulator to the badge. It was amazing to see how quickly he solved the issues and how productive he was. And it was fun fiddling with an emulator on an uncommon embedded target.

tyingq · on March 19, 2019

You mentioned fighting the borrow checker a bit around the emulated CPU address space. Did this come up anywhere else, like frame buffers?

Edit: Ugh. Never mind...not being familiar with NES I didn't recognize what a PPU was :)

MichaelBurge · on March 19, 2019

The borrow checker is mainly a problem when the same data needs to be modified in two different places. Only the PPU writes to a frame buffer(that it owns), so it wasn't a problem there.

The game controllers were another case: They are mapped in CPU address space, so the CPU needs a permanent mutable reference to them. But the top-level also needs to update them with inputs read from my Xbox 360 controller.

muterad_murilax · on March 19, 2019

http://www.michaelburge.us/assets/articles/20190318-nes-emul...

Off-topic: Were the "bumping wheel" tiles ever used in the actual game? If so, in which level?

muterad_murilax · on March 19, 2019

https://tcrf.net/Super_Mario_Bros._3/Unused_graphics#World_8

Apparently not.

domlebo70 · on March 19, 2019

Awesome. Love the blog post. I learnt Haskell using the NES (http://github.com/dbousamra/hnes). Definitely recommend doing your own.

crooked-v · on March 19, 2019

This reminds me that I want to make an NES emulator in JS using Redux, in part to play around with the entire machine state being serializable/deserializable and time-travelable.

juliangoldsmith · on March 19, 2019

Does anyone know if OP would be able to work around the single-ownership issue using Cell/RefCell?

I still haven't fully grasped how to use those, but they seem like they'd fit there.

mtsr · on March 19, 2019

Cell and RefCell are both !Sync, as they don't prevent data races. You would still need to wrap them in a Mutex to actually make them threadsafe. But Rc<RefCell<Ppu>> would forgo the need for unsafe{} here, at the cost of runtime enforcement of unique mutable access, without making it threadsafe. Or Arc<Mutex<Ppu>> for the thread safe variation.
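A minimal sketch of the `Rc<RefCell<Ppu>>` variant. The `Ppu` here has a single made-up field, and only a write to 0x2000 (the PPU control register in the CPU address space) is wired up.

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Default)]
pub struct Ppu {
    pub control: u8,
}

/// The CPU holds a shared handle to the PPU, not a raw pointer.
pub struct Cpu {
    ppu: Rc<RefCell<Ppu>>,
}

impl Cpu {
    pub fn write(&mut self, address: u16, value: u8) {
        if address == 0x2000 {
            // `borrow_mut` panics if another borrow is active at this point.
            self.ppu.borrow_mut().control = value;
        }
    }
}

/// The top-level object keeps its own handle to the same PPU.
pub struct Nes {
    pub cpu: Cpu,
    pub ppu: Rc<RefCell<Ppu>>,
}

impl Nes {
    pub fn new() -> Self {
        let ppu = Rc::new(RefCell::new(Ppu::default()));
        Nes { cpu: Cpu { ppu: Rc::clone(&ppu) }, ppu }
    }
}
```

The uniqueness of mutable access is checked at run time: holding a `borrow_mut()` guard while something else tries to borrow is reported (a panic, or an `Err` from `try_borrow`) instead of silently aliasing.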

bfrydl · on March 19, 2019

OP mentions that the use of unsafe code means it's single-threaded, so yes it's likely that RefCell could be used instead of unsafe code.

steveklabnik · on March 19, 2019

I'm frankly a bit worried about that statement; I believe it means that they're producing UB. Even with unsafe, Rust code is expected to uphold the safety rules.

I haven't read the code though.

bfrydl · on March 19, 2019

OP is using unsafe to store a mut pointer to a boxed value so that it can be mutated in two places. So the statement is just referring to data races.

steveklabnik · on March 19, 2019

Sounds like UB to me! I’ll have to take a look. It really depends on the details.

(I tried to run miri on it, but I'm on Windows, and this is unix-only.)

EDIT: even with porting some of the code, I can’t get the current repository to build; there’s a borrow checker error...

MichaelBurge · on March 19, 2019

Sorry about that. If it helps, I'm using Rust 1.32 to build:

    $ rustc --version
    rustc 1.32.0 (9fda7c223 2019-01-16)
    $ cargo --version
    cargo 1.32.0 (8610973aa 2019-01-02)

    $ cargo build --release
    $ cargo run --release --bin nes-emulator

steveklabnik · on March 19, 2019

It’s chill! So, two things:

1. I can send you a patch later, but if you use std::io::stdin() and stdout, instead of opening the file descriptors, it should compile on Windows.

2. The borrow error was coming from miri; it detected some UB! Compiling with cargo does work. It’s not the bit I expected; I can file a bug with the error once I’m back at my laptop.

Anyway, this is a cool project, and I’m glad you’re doing it; don’t let my worries bug you. I’ve been trying to learn more unsafe stuff and so the UB bits are just on my mind more lately.

EDIT: actually it's just from building on nightly rust, not stable. No miri needed. I wonder why stable is okay with this... anyway I filed some github issues, please feel free to ignore me or close them, but if you wanna keep talking about this, let's do that there :)
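For point 1 above, a portable shape might look like the sketch below. The one-command-per-line "ack" protocol is invented for the example and is not what the emulator actually speaks; the point is only that the loop is written against `BufRead`/`Write`, and the standard handles are plugged in at the edge.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical command loop: reads one command per line and acknowledges it.
pub fn serve<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        writeln!(output, "ack {}", line.trim())?;
    }
    output.flush()
}

/// Portable entry point: uses the standard handles, which work on Windows
/// and Unix alike, rather than platform-specific file descriptors.
pub fn serve_stdio() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    serve(stdin.lock(), stdout.lock())
}
```

A side benefit is that `serve` can be exercised in tests with in-memory buffers such as a byte slice and a `Vec<u8>`.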

sidcool · on March 19, 2019

What is something that I can do to learn Rust better? I have been through the book, but that's it. What would help me really learn Rust?

wyldfire · on March 19, 2019

"How do you get to Carnegie Hall?"

Seriously, the best thing to do for virtually all professional or hobby work is to get more experience with it. Choose a project with small enough scope to give you encouraging results to keep that feedback loop going.

sidcool · on March 19, 2019

That is a great analogy thanks.

asdkhadsj · on March 19, 2019

From personal experience, invest in good (for you) resources to learn the language. I made two attempts, and Rust didn't stick until I took "learning" seriously and purchased a book. My first approach, which has worked for every other language (mostly just reading docs and programming), was a massive headache.

That was years ago though, perhaps it's better now. All I know is after reading Programming Rust, I picked Rust up and it's been amazing. I love the language.

justwalt · on March 19, 2019

For me, I’ve been rewriting my simple scripts and old school projects in Rust, which has been both enjoyable and informative.

Project Euler problems are pretty trivial (wrt coding) but might have some small useful nuggets. The code advent calendar from 2018 is pretty helpful too, and it’s much more code oriented than PE problems.

I’ve seen people suggest rewriting the GNU coreutils, and I really like the idea, but not quite enough to go sifting through all of the C headers and all to get to the finish line.

Honestly, it probably doesn’t matter much what you start coding as long as you’re coding.

pimeys · on March 19, 2019

What about a command line utility to do some basic Unix tasks? Usually those are the easiest and usually also very fun things to do to learn Rust.

dfee · on March 19, 2019

How cool would it be if computers shipped with a small FPGA to get better performance for stuff like this?

leonardmh · on March 21, 2019

I’ll upvote anything using rust in a cool way