I'm the author of the post. I'll be happy to answer any of your questions :)

runeks · on March 3, 2017

Not a question, but with regards to:

> * We use all of the five different string types. It is annoying, but it is not a major problem.

cs[1] and the OverloadedStrings extension are all you need, in my experience.

[1] https://www.stackage.org/haddock/lts-8.3/string-conversions-...
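
A minimal sketch of that combination, assuming the string-conversions, text, and bytestring packages are available. The names greeting, asBytes, and asString are made up for illustration:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as BS
import Data.String.Conversions (cs)
import qualified Data.Text as T

-- With OverloadedStrings the literal is typed as Text directly.
greeting :: T.Text
greeting = "hello"

-- cs picks the conversion from the source and target types.
asBytes :: BS.ByteString
asBytes = cs greeting

asString :: String
asString = cs asBytes

main :: IO ()
main = putStrLn asString
```

The type signatures do the work here: cs has no idea which conversion to run until both ends are pinned down, so leaving them out tends to produce ambiguous-type errors.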

cauterize · on March 3, 2017

I like the Protolude prelude replacement for its lack of partial functions and its toS function for string conversions.

fajpunk · on March 3, 2017

Thanks for the write up! In the beginning you mention that you ran into some bugs in the Python version that would have been caught by the Haskell type checker. Can you go into more detail about what those bugs were?

Ruud-v-A · on March 3, 2017

The most serious one is that we were submitting jobs (as JSON) that were missing a few metadata fields. In Python we passed around dictionaries, and even though we had JSON schema validation in place, this slipped through. In Haskell, we define a record type and the corresponding serializers. It is more code, but what you get is that invalid data cannot exist at runtime: it simply cannot be represented.

Also, the compiler refuses to compile your code if you make a typo in a field name.
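
To illustrate the idea, here is a rough sketch using only base. The Job record, its field names, and parseJob are hypothetical, and a key/value list stands in for the JSON input (a real service would presumably use a JSON library for the serializers):

```haskell
import Text.Read (readMaybe)

-- Hypothetical job record; the fields are illustrative only.
data Job = Job
  { jobId    :: Int
  , jobOwner :: String
  , jobQueue :: String
  } deriving (Show, Eq)

-- Turn loosely typed input into a Job at the boundary.
-- Any missing or malformed field makes the whole result Nothing,
-- so a Job value that exists always has every field.
parseJob :: [(String, String)] -> Maybe Job
parseJob fields =
  Job
    <$> (lookup "id" fields >>= readMaybe)
    <*> lookup "owner" fields
    <*> lookup "queue" fields

main :: IO ()
main = do
  -- All fields present: yields a Job.
  print (parseJob [("id", "7"), ("owner", "alice"), ("queue", "default")])
  -- "queue" is missing: yields Nothing instead of a half-filled value.
  print (parseJob [("id", "7"), ("owner", "alice")])
```

Code past the parsing step takes a Job rather than a dictionary, so there is no way to hand it a job without a queue, and misspelling jobQueue as jobQeue is a compile error rather than a runtime KeyError.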

nerdponx · on March 3, 2017

I'm wondering if integrating MyPy into the build, or even using Cython for performance, could have helped.

Ruud-v-A · on March 3, 2017

We do actually use Mypy! I wrote about my experience with it here:

https://www.reddit.com/r/programming/comments/5x3hdd/channab...

BuuQu9hu · on March 3, 2017

Did you try PyPy? If so, how did it perform? If not, why not? PyPy adoption is at only 1% or so, and it'd be nice to see more PyPy usage.

Ruud-v-A · on March 3, 2017

The real issue with the Python scheduler was its algorithmic complexity. Using a faster implementation would have bought us a few extra months or maybe even a year, but it would only have postponed the need for a real solution.

TinyBig · on March 3, 2017

"Our lead developer Robert usually comes in a bit later, so we had about an hour to build a working prototype" - why did you only have an hour? What would have happened if the working prototype was not done when he came in?

arunix · on March 3, 2017

How did you debug the space leaks? Can anything be done to mitigate/avoid them?

Ruud-v-A · on March 4, 2017

The GHC runtime has built-in support for memory profiling; it can produce a graph that shows a breakdown of the heap over time. After trying various combinations of flags I managed to produce a graph where one part was clearly growing over time. The corresponding function was a recursive function with two arguments, the first never changing in recursive calls. I rewrote that to a single-argument nested function, and that made the leak go away.
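
The shape of that rewrite can be sketched as follows. This is not the actual function from the scheduler; countMatches and its worker go are invented for illustration, and the sketch only shows the transformation, not a reproduction of the leak:

```haskell
-- Before: the predicate is passed along unchanged on every recursive call.
countMatches :: (Int -> Bool) -> [Int] -> Int
countMatches _ []       = 0
countMatches p (x : xs) = (if p x then 1 else 0) + countMatches p xs

-- After: the unchanging argument is captured once by the outer function,
-- and a nested one-argument worker does the recursion.
countMatches' :: (Int -> Bool) -> [Int] -> Int
countMatches' p = go
  where
    go []       = 0
    go (x : xs) = (if p x then 1 else 0) + go xs

main :: IO ()
main = do
  print (countMatches even [1 .. 10])
  print (countMatches' even [1 .. 10])
```

Both versions compute the same result. Whether the nested form changes memory behaviour depends on the surrounding code and on what the compiler does with it, so the heap profile is still the thing to check after such a change.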