I have been using Nimrod for some personal projects and internal tools since about version 0.9.0. It is quite usable -- especially the current GitHub version -- but as you might expect from a 0.x project, it still has some rough edges. These haven't stopped me from being productive with it, but you should be prepared for the occasional "huh?" situation (the most common problem I've had was that, while I was experimenting with and learning the language, malformed code I inadvertently wrote would sometimes crash the compiler with an internal error, e.g. an illegal AST exception, rather than a proper diagnostic).

If you want to do some serious experimentation with Nimrod, I recommend using 0.9.3 (the GitHub version). It is considerably more mature and stable than 0.9.2.

The one oddity that I'm still getting used to is Nimrod's preference for value types: the string and seq[T] types use copying for assignment [1] by default, whereas most other languages assume reference semantics (you can get reference assignment instead, but you have to ask for it explicitly). This can create inadvertent overhead if you are unaware of it, but it also avoids some of the pitfalls you may otherwise run into with mutable strings.
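To make that concrete, here is a minimal hypothetical sketch (not from the original discussion): assignment copies the payload, while passing a seq to a procedure does not [1].

    var a = @[1, 2, 3]
    var b = a            # assignment: b gets its own copy of the data
    b[0] = 99
    echo a[0]            # prints 1; a is unaffected

    proc total(xs: seq[int]): int =
      for x in xs: result += x

    echo total(a)        # no copy is made for the parameter pass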

As for the default GC, it's not the case that you can actually guarantee a maximum pause time [2]. The GC uses deferred reference counting (i.e., it elides most RC operations and batches the ones that are needed). As long as your data structures are acyclic, that allows for a very low upper bound on GC pause times. Pause times for dealing with cyclic structures depend on the size of the cycles involved (and the size of the data structures attached to those cycles). A practical workaround when you have to deal extensively with cyclic data structures is to exploit the fact that each thread has its own heap: put all the time-critical work in one thread that avoids cyclic data structures, and the rest in another thread.
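To illustrate the distinction, a hypothetical sketch of the kind of structure that the deferred-RC part alone cannot reclaim:

    type
      Node = ref object
        data: int
        next: Node

    var a, b: Node
    new(a); new(b)
    a.next = b
    b.next = a   # reference cycle: plain (deferred) RC never sees the
                 # counts drop to zero, so the cycle collector has to
                 # find and free these two nodes

A ref type without any ref fields (say, only ints and strings) can never end up on such a cycle and is reclaimed by reference counting alone.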

[1] Copying is not used for passing arguments to a procedure or for returning a result, only for actual assignments.

[2] You can influence it to some extent by changing the value of ZctThreshold in system/gc.nim, which specifies the size of the zero count table. Each GC iteration will still have to scan the stack, of course.

Edit: Oops, to correct myself, Nimrod does actually have a GC_setMaxPause() procedure, though I still don't think it can guarantee hard upper bounds on pause times.
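Usage is a one-liner; as far as I can tell the argument is a soft target in microseconds, not a hard real-time bound (hypothetical sketch):

    GC_setMaxPause(100)   # ask the GC to aim for pauses of at most
                          # ~100 microseconds per step; a hint, not a
                          # hard guarantee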



Wait, if the RC scheme has a cycle collector, how do you ensure there are no CC pauses for acyclic data? Every CC algorithm I know of uses heuristics to decide when to scan the heap for cycles (usually when a reference count is decremented but stays above zero), and those can result in pauses proportional to the size of the live set on the heap, even when there are no cycles.


I am not sufficiently familiar with the inner details of Nimrod's GC implementation to give you a full answer, but, no, you don't have to scan the entire live heap to reclaim cycles.

The technique is called "trial deletion" and only has to traverse potential cycles. Strictly speaking, a type-agnostic implementation may have to traverse all objects reachable from an object whose reference count was decremented since the previous pass, but in a strongly typed language you can skip that for all objects that can't be part of a cycle. Nimrod at least makes use of that information in asgnRefNoCycle() in system/gc.nim.
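As a hypothetical illustration of how the type information can help (the {.acyclic.} pragma really exists; the type names here are made up):

    type
      # No ref fields at all, so values of this type can never sit on a
      # cycle; assignments to such refs can in principle skip the
      # cycle-candidate bookkeeping entirely.
      Leaf = ref object
        x, y: int
        label: string

      # Has ref fields, but the programmer promises never to build
      # cycles with it, so the GC may treat it like the acyclic case.
      TreeNode {.acyclic.} = ref object
        kids: seq[TreeNode]
        label: string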

Obviously, if everything on the heap is part of one big cycle, then, yes, you'll have to scan the entire heap.


It's still proportional to the live set in the general case. In practice, cycle collectors tend to have to scan a lot, leading to severe pause times (30 ms pauses in Firefox used to be common until the ad-hoc ForgetSkippable mechanism was added). The heuristics that cycle collectors use tend to fall down a lot, unfortunately.


Thank you for replying. That was a very good overview.

Could you elaborate a bit, if possible, on the thread-private heaps? I know Erlang's actors and Dart's isolates work that way (which makes it easy to implement a concurrent GC).

What is the mechanism for creating private heaps (if there is one)? Or is it just by convention, as in "I only create objects from this one thread and know that no other thread will access them"?


When a thread is created, it automatically comes with a new thread-local heap, and each thread allocates objects within its own heap by default. You can send objects to other threads via channels; the objects are deep-copied into the receiving thread's heap.
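A minimal sketch of that model (hypothetical example; API names as in current Nim, older Nimrod releases spell some of them with a T prefix such as TChannel/TThread; compile with --threads:on):

    var chan: Channel[string]

    proc worker() {.thread.} =
      # runs with its own thread-local heap
      echo "got: ", chan.recv()   # the string arrives as a deep copy
                                  # in this thread's heap

    open(chan)
    var t: Thread[void]
    createThread(t, worker)
    chan.send("hello")            # deep-copied on send
    joinThread(t)
    close(chan)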

Note that you can also fall back to GC-free allocation with untraced references (using ptr instead of ref), but you will then have to manage that part of the memory yourself (i.e., deallocate untraced objects yourself).
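For example (hypothetical sketch using alloc0/dealloc from the system module):

    type
      Point = object
        x, y: float

    let p = cast[ptr Point](alloc0(sizeof(Point)))  # invisible to the GC
    p.x = 1.0
    p.y = 2.0
    # ... use p ...
    dealloc(p)   # our job; the GC will never free untraced memory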


I am sold. I really like it!



