> is not necessarily a performance improvement over a sync version that just starts more workers and has the OS kernel scheduler organise things
Coroutines are not necessarily faster than threads, but they yield a performance improvement if one has to spin up thousands of them: they have less creation overhead and consume less RAM.
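A toy benchmark can make the creation-overhead point concrete. This is a rough sketch, not a rigorous measurement: it spins up N do-nothing threads versus N trivial asyncio coroutines and times each (the count and the do-nothing workload are arbitrary choices for illustration).

```python
import asyncio
import threading
import time

N = 2000  # arbitrary number of concurrent units to spin up

def spawn_threads(n):
    # Each thread gets its own OS-level stack and is created via the kernel.
    threads = [threading.Thread(target=lambda: None) for _ in range(n)]
    t0 = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - t0

async def spawn_tasks(n):
    # Coroutines are ordinary Python objects multiplexed on a single thread.
    t0 = time.perf_counter()
    await asyncio.gather(*(asyncio.sleep(0) for _ in range(n)))
    return time.perf_counter() - t0

thread_time = spawn_threads(N)
task_time = asyncio.run(spawn_tasks(N))
print(f"{N} threads: {thread_time:.3f}s, {N} coroutines: {task_time:.3f}s")
```

On most machines the coroutine side finishes far quicker, which is the "less creation overhead" half of the claim; the RAM half would need a separate measurement.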
> Coroutines are not necessarily faster than threads, but they yield a performance improvement if one has to spin up thousands of them: they have less creation overhead and consume less RAM.
This hardly matters when spinning up a few thousand threads. Only memory that's actually used is committed, one 4k page at a time. What is 10MB these days? And that is main memory, while it's much more interesting what fits in cache. At that point it doesn't matter if your data is in heap objects or on a stack.
Add to that the fact that Python stacks are mostly on the heap, the real stack growing only due to nested calls in extensions. It's rare for a stack in Python to exceed 4k.
Languages that do green threads don't do them for memory savings, but to save on context switches when a thread is blocked and cannot run. System threads are scheduled by the OS, green threads by the language runtime, which saves a context switch.
Green threads are scheduled by the language runtime and by the OS. If the OS switches from one thread to another in the same process, there is no context switch, really, apart from the syscall itself which was happening anyway (the recv that blocks and causes the switch). At least not on Linux, where I've measured the difference.
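One part of this is easy to observe from Python itself: with asyncio, the interleaving between coroutines happens at `await` points inside a single OS thread. A minimal sketch (the `worker` helper and the log format are made up for the demo):

```python
import asyncio
import threading

async def worker(name, log):
    for i in range(3):
        log.append((name, i, threading.get_ident()))
        # `await` is the switch point: the event loop picks the next
        # runnable coroutine; no OS thread switch is involved here.
        await asyncio.sleep(0)

async def main():
    log = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log

log = asyncio.run(main())
# Every entry carries the same OS thread id: the interleaving of "a"
# and "b" was done purely by the event loop.
assert len({tid for _, _, tid in log}) == 1
print(log)
```

What this can't show is the kernel-side cost comparison the parent comments are debating; for that you'd need to measure actual thread-to-thread switches as described above.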
This is not what is happening with flask/uwsgi. There is a fixed number of threads and processes with flask. The threads are only parallel for IO, while the processes are parallel always.
Which is fine until you run out of uwsgi workers because a downstream gets really slow sometimes. The point of async python isn't to speed things up, it's so you don't have to guess the right number of uwsgi workers you'll need in your worst-case scenario and run with those all the time.
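The worker-exhaustion failure mode can be sketched with a fixed-size thread pool standing in for a uwsgi worker pool (the pool size, request count, and 0.2 s "slow downstream" delay are all invented for the illustration):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def handle_request(_):
    # Stand-in for a request that blocks on a slow downstream call.
    time.sleep(0.2)
    return "ok"

t0 = time.perf_counter()
# 5 workers, like a fixed uwsgi pool.
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(handle_request, range(20)))
elapsed = time.perf_counter() - t0

# 20 requests / 5 workers = 4 sequential waves of ~0.2s each:
# as soon as concurrency exceeds the pool size, requests queue up.
print(f"{elapsed:.2f}s")
```

With the pool saturated, total latency grows in waves of the downstream delay, which is exactly the situation that forces you to over-provision workers for the worst case.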
Yep, and this test being shown is actually saying that about 5 sync workers acting on thousands of requests are faster than python async workers.
Theoretically it makes no sense. A task manager executing tasks in parallel with IO instead of blocking on IO should be faster... So the problem must be in the implementation.
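The theoretical expectation is easy to sanity-check in isolation. This sketch (delay and request count are made up, mirroring the fixed-pool scenario discussed above) overlaps 20 simulated IO waits instead of serializing them through a small pool:

```python
import asyncio
import time

async def fake_io(i):
    # Non-blocking wait: while this coroutine sleeps, the event loop
    # runs all the other pending "requests".
    await asyncio.sleep(0.2)
    return i

async def main():
    t0 = time.perf_counter()
    results = await asyncio.gather(*(fake_io(i) for i in range(20)))
    return time.perf_counter() - t0, results

elapsed, results = asyncio.run(main())
# All 20 waits overlap, so the total is close to one delay, not twenty.
print(f"{elapsed:.2f}s")
```

So when IO dominates, overlapping it should win; if the benchmarked async servers still lose, the overhead is coming from somewhere else in the stack, which is the "implementation" point above.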