Implementing a NES Emulator in Rust (michaelburge.us)
386 points by MichaelBurge on March 19, 2019 | 89 comments



Shameless plug - for anyone that's interested, I also wrote a NES emulator in Rust a few years back: https://github.com/koute/pinky

I even compiled it to WebAssembly: https://koute.github.io/pinky-web/

> The CPU address space has several PPU registers mapped. So the CPU maintains a permanent mutable reference to the PPU. But the top-level Nes object also owns the PPU. I worked around this using Box to assign fixed memory addresses to values, and then “unsafe” pointer dereferences when needed.

There's actually a neat little trick I used in my emulator that lets you satisfy Rust's single-ownership rule while keeping the subcomponents of the emulator independent. You can interconnect them without any unsafe, and the compiler can optimize the whole thing because everything is statically dispatched.

Basically (simplifying a little bit), for each component (CPU, PPU, APU, etc.):

1. You put the whole state of the component into a separate `State` structure.

2. You create a `Context` trait which has a `get_state`/`get_state_mut` method as well as various callbacks needed by the component itself (e.g. PPU's `Context` has a `peek_video_memory` so that it can read video memory which is stored externally).

3. You create an `Interface` trait (with `Context` as supertrait) which contains the public interface of a given component (e.g. PPU has `execute`, `peek_ppustatus`, etc.).

4. You put the whole implementation of the component inside of a `Private` trait (with `Context` as supertrait).

5. You do a blanket impl of `Interface` and `Private` for any `T` which implements `Context`.

So then you create a single `Nes` structure which has `cpu::State`, `ppu::State`, etc. inside of it as fields, and implements the `cpu::Context`, `ppu::Context`, etc. traits. Inside of those impls you just wire the various subcomponents together, and voila, every component is encapsulated from each other and yet they can talk with each other, and there is no indirection anywhere (no `Box`es, no `RefCell`s, etc.).
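
Roughly, as a minimal sketch of the five steps above. The names, fields and the placeholder logic in `step` are made up for illustration; this is not the actual pinky code, just the shape of the pattern:

```rust
mod ppu {
    /// 1. All PPU state lives in one plain struct.
    #[derive(Default)]
    pub struct State {
        pub status: u8,
        pub cycle: u64,
    }

    /// 2. What the PPU needs from the outside world.
    /// `Sized` lets the default methods below call the blanket-implemented `Private`.
    pub trait Context: Sized {
        fn get_state(&self) -> &State;
        fn get_state_mut(&mut self) -> &mut State;
        fn peek_video_memory(&self, address: u16) -> u8;
    }

    /// 3. Public interface of the component.
    pub trait Interface: Context {
        fn execute(&mut self) {
            Private::step(self)
        }
        fn peek_ppustatus(&self) -> u8 {
            self.get_state().status
        }
    }

    /// 4. Implementation details, invisible outside this module.
    trait Private: Context {
        fn step(&mut self) {
            // Placeholder logic only: a real PPU does far more per step.
            let value = self.peek_video_memory(0x2000);
            let state = self.get_state_mut();
            state.cycle += 1;
            state.status = value;
        }
    }

    /// 5. Blanket impls: anything that is a `Context` gets both traits.
    impl<T: Context> Interface for T {}
    impl<T: Context> Private for T {}
}

/// The single top-level struct that owns every component's state.
struct Nes {
    ppu: ppu::State,
    video_memory: [u8; 0x4000],
}

/// The wiring: this is the only place that knows where video memory lives.
impl ppu::Context for Nes {
    fn get_state(&self) -> &ppu::State {
        &self.ppu
    }
    fn get_state_mut(&mut self) -> &mut ppu::State {
        &mut self.ppu
    }
    fn peek_video_memory(&self, address: u16) -> u8 {
        self.video_memory[address as usize % self.video_memory.len()]
    }
}
```

With `ppu::Interface` in scope, calling `nes.execute()` runs the PPU step directly on the `Nes` value, so every call is resolved at compile time.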


I'm really enjoying your 6502 emulation code. It feels clean and easy to read.

https://github.com/koute/pinky/blob/master/mos6502/src/virtu...

(I imagine the rest is good too, but 6502 is what I know. :-) )


In the Iterators section if you use

  for (idx, x) in xs.iter().enumerate()
instead of

  for (x, idx) in xs.iter().zip(0..xs.len())
you get the 4x unroll (https://godbolt.org/z/XDyx0w).
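
For reference, here are the two forms side by side in a minimal sketch. The loop body is a made-up index-weighted sum, not the article's code; both functions compute the same result, and whether the optimizer unrolls one better than the other depends on the compiler version:

```rust
/// Index-weighted sum using `enumerate` (the index comes first in the tuple).
fn sum_enumerate(xs: &[u32]) -> u32 {
    let mut total = 0u32;
    for (idx, x) in xs.iter().enumerate() {
        total = total.wrapping_add(x.wrapping_mul(idx as u32));
    }
    total
}

/// The same computation using `zip` with an explicit index range.
fn sum_zip(xs: &[u32]) -> u32 {
    let mut total = 0u32;
    for (x, idx) in xs.iter().zip(0..xs.len()) {
        total = total.wrapping_add(x.wrapping_mul(idx as u32));
    }
    total
}
```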


Why not use serde instead of rolling your own "Savable"?

Also you should use Rc/Arc and Weak (along with RefCell/Mutex if needed) instead of the unsafe pointers. Even better, if possible, move the functions that access multiple components to the object that holds them all, and don't use any "smart pointers". Another possible design is to pass borrowed references explicitly to the methods, and add them to the trait signatures where needed.
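
A minimal sketch of the "pass borrowed references explicitly" design. The types and the tiny address map are hypothetical and only assume the usual NES layout (2 KiB of RAM mirrored up to 0x1FFF, PPUCTRL at 0x2000):

```rust
struct Ppu {
    ctrl: u8,
}

struct Cpu {
    ram: [u8; 0x800],
}

impl Cpu {
    /// The PPU is borrowed only for the duration of the call,
    /// so the CPU never has to store a pointer to it.
    fn write(&mut self, ppu: &mut Ppu, address: u16, value: u8) {
        match address {
            0x0000..=0x1FFF => self.ram[(address & 0x07FF) as usize] = value,
            0x2000 => ppu.ctrl = value,
            _ => {} // everything else is ignored in this sketch
        }
    }
}

struct Nes {
    cpu: Cpu,
    ppu: Ppu,
}

impl Nes {
    /// The owner of both components borrows its two fields separately,
    /// which the borrow checker accepts without any `unsafe`.
    fn cpu_write(&mut self, address: u16, value: u8) {
        self.cpu.write(&mut self.ppu, address, value);
    }
}
```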

Regarding memory access, it's probably fastest to have a hardcoded fast path for RAM and ROM, and then process the rest like now. Lookup tables might help as well (as long as all the data including them fits in cache, of course).
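
Something like this, as a rough sketch. `Bus`, `read_slow` and the ROM layout are assumptions (a simple cartridge with no bank switching, ROM mapped at 0x8000..=0xFFFF), not the article's code:

```rust
struct Bus {
    ram: [u8; 0x800], // 2 KiB internal RAM, mirrored up to 0x1FFF
    prg_rom: Vec<u8>, // cartridge ROM, assumed non-empty and not bank-switched
}

impl Bus {
    /// Fast path: RAM and ROM are plain array reads decided by one `match`.
    #[inline]
    fn read(&mut self, address: u16) -> u8 {
        match address {
            0x0000..=0x1FFF => self.ram[(address & 0x07FF) as usize],
            0x8000..=0xFFFF => {
                let offset = (address - 0x8000) as usize;
                // The modulo mirrors a ROM that is smaller than the 32 KiB window.
                self.prg_rom[offset % self.prg_rom.len()]
            }
            _ => self.read_slow(address),
        }
    }

    /// Slow path: PPU/APU registers, controllers, mappers and so on.
    /// Stubbed out here; it takes `&mut self` because register reads can have side effects.
    #[cold]
    fn read_slow(&mut self, _address: u16) -> u8 {
        0
    }
}
```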


This is very interesting. Do you have a good example to study how this should be done?


So what are your general thoughts on Rust? Would you use it again? How does it compare to using C or C++? Did you enjoy working in it? What was disappointing/motivating about the language?


If you like emulators and Rust, I suggest you take a look at Ferris's YouTube channel: https://www.youtube.com/channel/UC4mpLlHn0FOekNg05yCnkzQ/vid...

He streams his development of a Nintendo 64 and a Virtual Boy emulator in Rust.


And here is a video [1] and free to read online book [2] that someone else made about implementing a Game Boy emulator in Rust.

[1]: RustFest Rome 2018 - Ryan Levick: Oh Boy! Creating a Game Boy Emulator in Rust - https://youtu.be/B7seNuQncvU

[2]: DMG-01: How to Emulate a Game Boy - http://blog.ryanlevick.com/DMG-01/


This is amazing. I've long wanted to get into some of this stuff and lately have fallen in love with Rust. I can't wait to watch these videos :D.


If I recall correctly, P.C. Walton, one of the creators of Rust, used an NES emulator during Rust’s early development to measure the performance of the language’s output binaries.


Indeed. I peeped at his code[1] frequently while I worked on a GameBoy Original emulator[2] that I've as of yet failed to finish.

[1]: https://github.com/pcwalton/sprocketnes

[2]: https://github.com/zacstewart/gbrs


Question, what made it easier to build this in Rust than other languages? I know the advantages of Rust, but what exactly was the experience in this context?


Implementing an NES emulator is a great way to learn more about a language and the internals of the NES (which is well documented). It's sort of a rite of passage and has been done in most of the "newer" languages, like Nim (https://github.com/def-/nimes) and Go (https://github.com/fogleman/nes).


It's actually "rite of passage", as in Latin "ritus".


oops fixed.. thx


Almost: rite _of_ passage (not "to")


argh.. fixed again thx


You have now completed this write of passage.


To become a passage wright.


Well-documented, but many sources are contradictory. For such well researched hardware, there is a lot to be confused about when researching this topic by yourself.


At least there's no shortage of implementations to reference.


The author doesn't really suggest that Rust was chosen for any particular reason. A few benefits are mentioned but they aren't really specific to emulators.


Because why not?


From the article:

"C is probably still a better choice for those cases: It’s more difficult to find someone who can implement a Rust compiler than a C compiler.

But Rust seems usable in any project where C++ is a viable language."


It's probably because implementing an emulator is highly rewarding and easy enough, which makes learning a new language more exciting, and the NES is extremely well documented.


It seems that unsafe references were used, so the advantages of Rust appear to have been somewhat relinquished in this case.


I'm not sure you can avoid `unsafe` in an embedded type of project. `unsafe` isn't necessarily bad, and when it is used, it at least makes it very clear where extra attention should be paid.


I also wrote a NES emulator in Rust, and managed to do it without using a single unsafe block. I used SDL bindings for graphics, if that counts, but it can also run in headless mode (for testing) without SDL perfectly well.


Do you have a link? I'd like to see how yours compares with OP's.


Not finished, but for what it's worth: https://github.com/nukeop/mr-cool-nes


Is this named after Mr Cool Ice?


you know it


Writing an emulator is not really embedded programming (the target often qualifies as an embedded environment but the host can be any random desktop computer).

It's definitely possible to write an emulator without unsafe code. Here's a toy gameboy emulator I wrote a while ago in Rust without any unsafe code: https://github.com/simias/gb-rs

Here's a very incomplete PSX emulator with only a single unsafe (which might not even be necessary anymore, I haven't updated the code in a while): https://github.com/simias/rustation



The SDL2 dependency? I don't understand what you mean by that.

Or are you saying that a dependency that I use happens to use unsafe code? In which case it's true but then by that definition it's effectively almost impossible to write 100% safe rust since the stdlib itself contains a non-negligible amount of unsafe code.


It’s always impossible, as you have to call into the OS at some point to do anything of use.


Emulating a system that runs at a couple of MHz on a 4x3 GHz chip ought to be doable using managed memory.


Nitpicking a bit, but there aren't a lot of useful things that could be multithreaded in an 8-bit emulator, so it would be more like 1x3 GHz ;)

In those old 8- and 16-bit machines, all components usually ran completely synchronously on each clock tick. For instance, if the video emulation runs out of sync with the CPU emulation for a tick or two, a write from the CPU to a video hardware register might not make it in time, and you get garbage video output.

Modern asynchronous systems may need less (relative) host system performance for emulation, because the timing requirements between the different hardware components are much more relaxed.


The original hardware was multithreaded... in the sense that there were several separate processing units.


Yes, but as far as I can glean from the NES spec, those still run from the same clock (with different dividers, but they're still locked to each other).

I haven't written an emulator for the NES yet, but on machines like the Amstrad CPC or C64, the CPU can reprogram video hardware registers (like color palette entries) at any time, for instance in the middle of a scanline. When this is off by a tick, you get the color palette change a couple of pixels late or early.

If the CPU and video chip emulation ran on different threads, they would need to sync with each other a few million times per second, which doesn't sound like a good idea. If synchronization is only needed once per scanline, or even once per frame, that's an option of course. Often this is good enough to run games that didn't go too close to the metal (again, this is only from my experience with home computer emulators; I haven't done NES stuff yet).

Parallelizing through SIMD might be an option though (one can pack a lot of 4- or 8-bit counters into a 512-bit register).


Yes, but their emulation needs to be performed in lockstep, normally per clock tick or at least externally visible state change, to accurately model their interaction with respect to timing.

Threading in that case would likely mean several orders of magnitude of performance degradation.
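
A minimal single-threaded sketch of that lockstep idea. The `Cpu` and `Ppu` types are stand-ins that only count ticks, and the 3:1 ratio assumes an NTSC NES, where the PPU runs three dots per CPU cycle:

```rust
#[derive(Default)]
struct Cpu {
    cycles: u64,
}

#[derive(Default)]
struct Ppu {
    dots: u64,
}

impl Cpu {
    /// Stand-in for executing one CPU cycle.
    fn tick(&mut self) {
        self.cycles += 1;
    }
}

impl Ppu {
    /// Stand-in for rendering one PPU dot.
    fn tick(&mut self) {
        self.dots += 1;
    }
}

/// Assumed NTSC ratio: three PPU dots per CPU cycle.
const PPU_DOTS_PER_CPU_CYCLE: u32 = 3;

/// Both components advance inside the same loop, so a register write made
/// on CPU cycle N is visible to the PPU before its next dot.
fn run_lockstep(cpu: &mut Cpu, ppu: &mut Ppu, cpu_cycles: u64) {
    for _ in 0..cpu_cycles {
        cpu.tick();
        for _ in 0..PPU_DOTS_PER_CPU_CYCLE {
            ppu.tick();
        }
    }
}
```

Splitting the two `tick` calls across threads would put a synchronization point inside this inner loop, which is where the cost comes from.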


Except that an explicit goal of the project was to run simulations much faster than real-time to train an AI to play the game, so they still wanted to squeeze the maximum performance out of the program.


unsafe exists in Rust for when you need it. When it is used, it means nothing more than you’re writing code for which you must ensure the safety instead of the compiler.

The times when unsafe is poorly used are when it’s being abused to ignore safety related compilation issues, like lifetimes and/or mutability. Though, even in those cases it might be ok if you’re using lower level concurrency primitives to enforce those guarantees.

unsafe does not remove all advantages of Rust, it only removes a few restrictions. It should be avoided unless necessary.


It doesn’t remove restrictions, it adds unchecked constructs. Very different!


Hm, I’m not sure I see the difference between those concepts, but I do like your framing better.


The main point is that if you take safe code and put it inside an unsafe block, the meaning of the code doesn't change. So for example, array indexing is bounds-checked in safe code, and non-bounds-checked array access is only available in unsafe blocks. This does not mean that the call `a[i]` loses bounds-checking in an unsafe block. It has the same semantics regardless of context. Rather, the separate call `a.get_unchecked(i)` is available inside an unsafe block and cannot be called in safe code.
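
A small sketch of that point, with hypothetical function names: the first function still panics on an out-of-range index even though the indexing sits in an `unsafe` block, while the second uses the separate `get_unchecked` call after doing its own check:

```rust
/// `a[i]` keeps its bounds check inside `unsafe`; the block changes nothing here.
#[allow(unused_unsafe)]
fn index_in_unsafe(a: &[u8], i: usize) -> u8 {
    unsafe { a[i] }
}

/// `get_unchecked` skips the check, so the caller has to establish `i < a.len()`.
fn get_if_in_bounds(a: &[u8], i: usize) -> Option<u8> {
    if i < a.len() {
        // SAFETY: `i` was just checked to be in bounds.
        Some(unsafe { *a.get_unchecked(i) })
    } else {
        None
    }
}
```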


What my sibling said, but also, that unsafe doesn’t mean you’re free from rules; where safe and unsafe interact, your unsafe code is expected to uphold the safety guarantees that the safe code expects.

This matters because of stuff like the article; if you see the other thread, I’m pretty sure (though haven’t verified yet) that this code has UB, even though they’re using unsafe.


Right, which is why I phrased it as a “few restrictions”, I do understand the point you’re making.

I suppose the reason I consider it restrictions is due to things like “in safe code you can not work with raw pointers”. That seems like a restriction to me, but I can see why saying it that way is less accurate.


If you use unsafe Rust, you do give up the advantage of being able to say, "Look at this code, no unsafe, therefore no UB." However, the lifetime and ownership model is still doing a lot for you. You get a lot of control over the way that your safe code (hopefully most of your program) can interact with your unsafe code. For example if I have this function (partially copied from libstd):

    fn reverse_slice<T>(slice: &mut [T]) {
        let len = slice.len();
        for i in 0..len / 2 {
            // Unsafe swap to avoid the bounds check in safe swap.
            unsafe {
                let pa: *mut T = slice.get_unchecked_mut(i);
                let pb: *mut T = slice.get_unchecked_mut(len - 1 - i);
                // The third argument is the number of `T` elements to swap.
                std::ptr::swap_nonoverlapping(pa, pb, 1);
            }
        }
    }
In theory, I could use those unsafe pointers to alias the same value and cause UB. Or I could stash them somewhere that outlives the lifetime of what they're pointing to, and cause UB later on that way. So I have to audit the function carefully to avoid doing those things. But at the same time (in the absence of other broken unsafe code in the caller), this function can rely on a lot of guarantees:

- The `slice` reference is guaranteed to be unique. No other code, on any thread, has read or write access to `slice` while this function is running.

- The length of `slice` is guaranteed to be correct. All of the memory it refers to has been properly initialized, and computing pointer offsets into it cannot overflow `usize` or otherwise cause UB. (This guarantee is surprisingly subtle, because it means the maximum length of a slice is `isize::MAX` rather than `usize::MAX`.)

- The type `T` is guaranteed to be safe to move. It doesn't contain any internal references to its own memory, which moving would invalidate.

All of this put together makes it possible to audit this function by itself, to make sure safe code can't use it to trigger UB.


To declare the advantages relinquished requires some kind of discussion of the `unsafe` blocks' scope. How many, how much code?

Regardless, I'm not even sure I'd say that memory safety is Rust's biggest advantage. It's certainly the headliner and it's critically important. But it's a really appealing package overall IMO.


One of Rust's advantages is that you can drop into unsafe to do this sort of thing.


There's also sprocketnes, which was originally written ~6 years ago by pcwalton, who works on rust/servo at Mozilla.

It's been kept up to date and it's interesting to see how the project evolved along with the language:

https://github.com/pcwalton/sprocketnes


While the project itself is a great resource, the dependencies marked as "*" are unfortunate for a project that is 6 years in the making.


What was the most challenging part for you in building this emulator?


It took 14 days to get it functionally-complete, where I could beat Mario using savestates.

The PPU took 7 days, the CPU took 5 days, the APU took 1 day, and there's a spare day for everything else.

The CPU was easy to make: Its natural unit of output is a single CPU instruction. Even if there are many details, it's a straightforward process to compare an instruction-by-instruction log with a reference log until there are no differences.
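
The comparison step can be as simple as finding the first line where the two logs differ. A minimal sketch, where the function name is made up and the log format is whatever your emulator and the reference emulator both print:

```rust
/// Returns the 1-based line number and both lines at the first mismatch,
/// or `None` if the logs agree line for line.
fn first_divergence<'a>(
    ours: &'a str,
    reference: &'a str,
) -> Option<(usize, &'a str, &'a str)> {
    let mut a = ours.lines();
    let mut b = reference.lines();
    let mut line_no = 0;
    loop {
        line_no += 1;
        match (a.next(), b.next()) {
            // Both logs ended at the same time without a mismatch.
            (None, None) => return None,
            (x, y) if x == y => continue,
            // Either the lines differ or one log ended early.
            (x, y) => {
                return Some((
                    line_no,
                    x.unwrap_or("<end of log>"),
                    y.unwrap_or("<end of log>"),
                ))
            }
        }
    }
}
```

Fix the instruction at the reported line, rerun, and repeat until it returns `None`.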

The PPU was harder because the natural unit of output is an entire frame: It's not any easier to get the first pixel correct, than to get them all correct. So almost everything needs to be in place before you can validate its output.


That's impressive. A few follow up questions:

1. Was this your first NES emulator?

2. Your first project in Rust?


Yes to both of these. I usually use C for projects like these, and wanted to see how Rust compared.


Follow up to that: Did you write any other emulators in the past?


I want to build one myself as well, but I don't want to use existing code as a guide, rather challenge myself to implement from raw specs.

The problem is I haven't really found any yet that helped me (admittedly, I have so far devoted just one Saturday morning to it.) Are there any specs about the hardware that you used, or that anyone else can share?


> Are there any specs about the hardware that you used, or that anyone else can share?

Everything you need is either on http://wiki.nesdev.com or on their forums. I know because that's what I did with my NES emulator - I explicitly didn't want to look at any source code and instead wanted to implement everything only based on the docs.

Granted, what's on the NESdev isn't always easy to grok, up to the point of being really confusing sometimes. What I've found really helped is gradually setting up a test suite based on various test ROMs I could find, which even allows you to implement some parts of the emulator TDD-style once you get the basics up and running.


Shoutout to TASVideos here, who maintain a fairly up to date list of test ROMs for various hardware features. Once you have the emulator basics down (working well enough, say, to run Super Mario Bros, which is surprisingly demanding) you can throw blargg's tests at your emulator and start chasing the accuracy rabbit down the most marvelous of rabbit holes.

http://tasvideos.org/EmulatorResources/NESAccuracyTests.html


Nice, I wrote an NES emulator in Swift once: https://github.com/robb/NES


This is the second emulator I've seen in Rust, the other being an ARM emulator done as a personal project by somebody who was an intern here. I do emulators, but in C. (BTW, hiring)

For an expert performance-sensitive C programmer who pulls out all the platform-specific tricks to make stuff go fast, how do you think rust would be? The default safety is appealing, but I have lots of concerns.

Bitfield access looks painful. In C, I can set things up to make read/write named access to bitfields easy. Granted, the header to set this up is complicated: make a union of anonymous structs, each with a bitfield and the needed padding. With that though, I can use bitfields just like ordinary struct members. I can pluck fields out of opcodes for emulation, or I can fill them in for a JIT.

Sometimes in C, one might use gcc's computed goto extension. (getting a void pointer from a unary && operator applied to a label, then derefing it at the goto) It doesn't seem like rust has this, even in unsafe code. Actually, there isn't even a "goto" keyword... which I find to be a worrisome sign that might indicate stubbornly academic language design.

The bounds checking on arrays is kind of the whole point of rust, but if that can't get out of the way for speed then I'd have to make everything unsafe. If I do that, then rust is pointless. Has anybody checked the assembly to see if the compiler is good at eliminating the checks? For example, if I use a 5-bit bitfield to index into a 32-entry table, do the checks get optimized out?

What if I want aliasing? With gcc I can mark things __may_alias__, and with Visual Studio I don't even need that. I had a case where I needed to lay structs over each other like shingles, in groups of 4 with internal padding, so that the same struct member of each of the 4 structs would be adjacent in memory. This was needed so that vector intrinsics could be used.

Speaking of that, are there vector intrinsics? What if I specifically want MMX opcodes in one place (for MMX emulation) and SSE opcodes in another place?

Can I get a switch without a default, such that the compiler doesn't try to generate code for a case that isn't listed? I know this will seem like a horrible idea to many programmers, but sometimes performance matters. When the language can't keep up, I have to drop down into assembly, and that sucks more.


Rust does not have goto, but it’s not about being academic: a lot of Rust’s compile time checks are flow-based, and unstructured control flow would make them a lot slower and more complex.

Bitfields have a package that makes them easy.

If you want aliasing, there’s UnsafeCell.

Yes, there are vector intrinsics, and they’re part of the language, not a non-standard extension. And there are tools to choose between things too.

Switch requires a default, but you can declare the default unreachable. You should always be able to get the same asm.
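
A rough sketch of the last point and of the table-lookup question, with hypothetical names and a made-up 2-bit field. Whether the bounds check is actually removed depends on the optimizer, so check the generated assembly:

```rust
/// Index a 32-entry table with the low 5 bits of an opcode.
/// `opcode & 0x1F` is always below 32, so the optimizer can usually
/// prove the access is in bounds and drop the check.
fn lookup(table: &[u8; 32], opcode: u32) -> u8 {
    table[(opcode & 0x1F) as usize]
}

/// Decode a 2-bit field; all four values are listed, so the default
/// arm can never be taken.
fn decode(opcode: u32) -> &'static str {
    match opcode & 0b11 {
        0 => "load",
        1 => "store",
        2 => "add",
        3 => "branch",
        // `unreachable!()` panics if it is ever hit.
        // `std::hint::unreachable_unchecked()` is the unsafe variant that lets
        // the optimizer assume the arm is never taken (UB if it is).
        _ => unreachable!(),
    }
}
```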


Rust has no strict aliasing, so all Rust types are may_alias.


I thought it was the other way, all being effectively restrict, with the hazard removed by the borrow checker.

In any case, I want to be able to have both behaviors.


Slightly off-topic: I was at Hackaday Belgrade last year, and a very good friend of mine ported a NES emulator to the badge. It was amazing to see how quickly he solved the issues and how productive he was. And it was fun fiddling with an emulator on an uncommon embedded target.


You mentioned fighting the borrow checker a bit around the emulated CPU address space. Did this come up anywhere else, like frame buffers?

Edit: Ugh. Never mind...not being familiar with NES I didn't recognize what a PPU was :)


The borrow checker is mainly a problem when the same data needs to be modified in two different places. Only the PPU writes to a frame buffer (that it owns), so it wasn't a problem there.

The game controllers were another case: They are mapped in CPU address space, so the CPU needs a permanent mutable reference to them. But the top-level also needs to update them with inputs read from my Xbox 360 controller.


http://www.michaelburge.us/assets/articles/20190318-nes-emul...

Off-topic: Were the "bumping wheel" tiles ever used in the actual game? If so, in which level?



Awesome. Love the blog post. I learnt Haskell using the NES (http://github.com/dbousamra/hnes). Definitely recommend doing your own.


This reminds me that I want to make an NES emulator in JS using Redux, in part to play around with the entire machine state being serializable/deserializable and time-travelable.


Does anyone know if OP would be able to work around the single-ownership issue using Cell/RefCell?

I still haven't fully grasped how to use those, but they seem like they'd fit there.


Cell and RefCell are both !Sync, as they don't prevent data races. You would still need to wrap them in a Mutex to actually make them thread-safe. But Rc<RefCell<Ppu>> would forgo the need for unsafe{} here, at the cost of runtime enforcement of unique mutable access, without making it thread-safe. Or Arc<Mutex<Ppu>> for the thread-safe variation.
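
A minimal sketch of the single-threaded variant. The `Ppu`, `Cpu` and `Nes` types here are hypothetical stand-ins, not OP's code:

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Ppu {
    ctrl: u8,
}

/// The CPU holds a shared handle to the PPU instead of a raw pointer.
struct Cpu {
    ppu: Rc<RefCell<Ppu>>,
}

impl Cpu {
    fn write_ppuctrl(&self, value: u8) {
        // `borrow_mut` panics at runtime if another borrow is still alive.
        self.ppu.borrow_mut().ctrl = value;
    }
}

/// The top-level object keeps its own handle to the same PPU.
struct Nes {
    cpu: Cpu,
    ppu: Rc<RefCell<Ppu>>,
}

fn new_nes() -> Nes {
    let ppu = Rc::new(RefCell::new(Ppu { ctrl: 0 }));
    Nes {
        cpu: Cpu { ppu: Rc::clone(&ppu) },
        ppu,
    }
}
```

Swapping `Rc<RefCell<_>>` for `Arc<Mutex<_>>` and `borrow_mut()` for `lock().unwrap()` gives the thread-safe variation.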


OP mentions that the use of unsafe code means it's single-threaded, so yes it's likely that RefCell could be used instead of unsafe code.


I'm frankly a bit worried about that statement; I believe it means that they're producing UB. Even with unsafe, Rust code is expected to uphold the safety rules.

I haven't read the code though.


OP is using unsafe to store a mut pointer to a boxed value so that it can be mutated in two places. So the statement is just referring to data races.


Sounds like UB to me! I’ll have to take a look. It really depends on the details.

(I tried to run miri on it, but I'm on Windows, and this is unix-only.)

EDIT: even with porting some of the code, I can’t get the current repository to build, there’s a borrow checker error...)


Sorry about that. If it helps, I'm using Rust 1.32 to build:

    $ rustc --version
    rustc 1.32.0 (9fda7c223 2019-01-16)
    $ cargo --version
    cargo 1.32.0 (8610973aa 2019-01-02)

    $ cargo build --release
    $ cargo run --release --bin nes-emulator


It’s chill! So, two things:

1. I can send you a patch later, but if you use std::io::stdin() and stdout, instead of opening the file descriptors, it should compile on Windows.

2. The borrow error was coming from miri; it detected some UB! Compiling with cargo does work. It’s not the bit I expected; I can file a bug with the error once I’m back at my laptop.

Anyway, this is a cool project, and I’m glad you’re doing it; don’t let my worries bug you. I’ve been trying to learn more unsafe stuff and so the UB bits are just on my mind more lately.

EDIT: actually it's just from building on nightly rust, not stable. No miri needed. I wonder why stable is okay with this... anyway I filed some github issues, please feel free to ignore me or close them, but if you wanna keep talking about this, let's do that there :)


What is something that I can do to learn Rust better? I have been through the book, but that's it. What would help me really learn Rust?


"How do you get to Carnegie Hall?"

Seriously, the best thing to do for virtually all professional or hobby work is to get more experience with it. Choose a project with small enough scope to give you encouraging results to keep that feedback loop going.


That is a great analogy thanks.


From personal experience, invest in good (for you) resources to learn the language. I made two attempts, and it wasn't until I took "learning" seriously and purchased a book that Rust stuck. My first attempt, using the approach that has worked for every other language (mostly just reading docs and programming), was a massive headache.

That was years ago though, perhaps it's better now. All I know is after reading Programming Rust, I picked Rust up and it's been amazing. I love the language.


For me, I’ve been rewriting my simple scripts and old school projects in Rust, which has been both enjoyable and informative.

Project Euler problems are pretty trivial (wrt coding) but might have some small useful nuggets. The code advent calendar from 2018 is pretty helpful too, and it’s much more code oriented than PE problems.

I’ve seen people suggest rewriting the GNU coreutils, and I really like the idea, but not quite enough to go sifting through all of the C headers and all to get to the finish line.

Honestly, it probably doesn’t matter much what you start coding as long as you’re coding.


What about a command line utility to do some basic Unix tasks? Usually those are the easiest and usually also very fun things to do to learn Rust.


How cool would it be if computers shipped with a small FPGA to get better performance for stuff like this?


I’ll upvote anything using rust in a cool way



