You have to keep the stack around in memory, as opposed to having a fixed structure. And fixed structures really pull ahead when you think about allocation: you can use a segregated-fit free list, whereas you can't with a variable-sized thing like a stack. If the stack starts small and grows, you pay the cost of copying and reallocation every time it does: another loss. Allocation in a segregated-fit scheme is an order of magnitude faster than traditional malloc, or than allocating in a nursery and tenuring.
For these reasons and others, nginx could never be as fast as it is with a goroutine model.
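To make the allocation point concrete, here is a minimal sketch of a segregated-fit free list in C. The size classes and the names `pool_alloc`, `pool_free`, and `size_class` are made up for illustration and are not taken from nginx or any real allocator. The point is only that, for fixed-size records, the common path is a pointer pop or push.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical size classes; a real allocator would tune these. */
#define NUM_CLASSES 4
static const size_t class_size[NUM_CLASSES] = {32, 64, 128, 256};

/* A free block stores the link to the next free block in its own memory. */
typedef struct free_node { struct free_node *next; } free_node;

/* One singly linked free list per size class. */
static free_node *free_list[NUM_CLASSES];

/* Map a request size to the smallest class that fits, or -1 if too large. */
static int size_class(size_t n) {
    for (int i = 0; i < NUM_CLASSES; i++)
        if (n <= class_size[i]) return i;
    return -1;
}

/* Fast path: pop the head of the matching free list. */
static void *pool_alloc(size_t n) {
    int c = size_class(n);
    if (c < 0) return malloc(n);           /* oversized: fall back to malloc */
    free_node *node = free_list[c];
    if (node) {
        free_list[c] = node->next;
        return node;
    }
    return malloc(class_size[c]);          /* list empty: get a fresh block */
}

/* Push the block back onto its class list. The caller passes the size
 * it originally asked for, so no per-block header is needed. */
static void pool_free(void *p, size_t n) {
    assert(p != NULL);
    int c = size_class(n);
    if (c < 0) { free(p); return; }
    free_node *node = p;
    node->next = free_list[c];
    free_list[c] = node;
}
```

A block freed with `pool_free(p, 40)` lands on the 64-byte list, and the next `pool_alloc` of any size from 33 to 64 bytes gets that same block back without touching malloc. A growable stack has no single size class to live in, which is why this trick doesn't apply to it.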
I agree that a fixed structure is more efficient than a growable stack, especially in terms of memory allocation. But I don't understand how you apply this to an evented server. Asynchronous callbacks are often associated with closures. But closures can vary in size, which makes it hard to store them in a fixed-size structure. What am I missing?
I haven't read the nginx source code, but I guess they are able to store continuations in a fixed structure because they don't use closures and they know in advance what information must be kept. I don't see how this approach can be used as a general-purpose concurrency mechanism for a programming language. But I'd like to learn something :-)
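Here is roughly what I have in mind by a continuation stored in a fixed structure, sketched in C. This is a guess at the general shape and not taken from the nginx source: the `conn` record, the handler names, and the byte counts are all hypothetical. Instead of a closure capturing arbitrary variables, every field the handlers might need is declared up front, and the "continuation" is just a function pointer to the next step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection record; its size is known at compile time. */
typedef struct conn conn;
typedef void (*handler_fn)(conn *);

struct conn {
    handler_fn next;     /* the "continuation": what to run on the next event */
    int fd;              /* socket descriptor (unused in this sketch) */
    size_t bytes_read;   /* state that a closure would otherwise capture */
    int done;
};

/* Final step: mark the connection finished and clear the continuation. */
static void on_write(conn *c) {
    c->done = 1;
    c->next = NULL;
}

/* Read step: pretend each event delivers 128 bytes; once 256 bytes
 * have arrived, switch the continuation to the write step. */
static void on_read(conn *c) {
    c->bytes_read += 128;
    if (c->bytes_read >= 256)
        c->next = on_write;
}

/* The event loop would call this once per readiness event. */
static void dispatch(conn *c) {
    assert(c != NULL);
    if (c->next)
        c->next(c);
}
```

Starting from `conn c = { on_read, 3, 0, 0 };`, three calls to `dispatch(&c)` run `on_read` twice and then `on_write`. The record never changes size, so it could live in a fixed-size pool, but the price is that the programmer has to enumerate all the state by hand, which is exactly what a general-purpose language can't ask of its users.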
This is really interesting, but I feel like Erlang is a counterexample. I'm not sure that's a good comparison, so I decided to ask and risk sounding stupid.