
So ~0.85"?

Python is not compiled, and it starts & loads pandas (which is comparable to the libraries loaded in the article) in ~0.4" on my computer, and that's a notoriously slow language.

If I were to use e.g. Rust with polars, load time would be virtually none. And when I have to process ~50k different datasets, I can't afford 0.85" per file, which would translate to roughly 12 hours of overhead (50,000 × 0.85 s ≈ 42,500 s).
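
(For reference, a minimal sketch of how that kind of start-and-load figure can be checked on the Julia side; CSV and DataFrames are just assumed stand-ins for whatever the article actually loads:)

    # load_time.jl -- rough check of package-load time in a fresh Julia session.
    # Run as `time julia --startup-file=no load_time.jl` so the external `time`
    # also captures interpreter startup, not just package loading.
    t0 = time()
    using CSV, DataFrames   # assumed stand-ins for the article's packages
    println("package load: $(round(time() - t0, digits=2)) s")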



> If I were to use e.g. Rust with polars, load time would be virtually none.

Because you're compiling ahead of time...

And if you need to do the same in Julia, you should also pre-compile, or use some other method like https://github.com/dmolina/DaemonMode.jl (their demo shows loading a database, with subsequent loads after the first taking roughly 0.2% of the first's time).
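
(Roughly, the DaemonMode.jl pattern from its README looks like the sketch below; serve() and runargs() are the names the README uses, and process_dataset.jl is a hypothetical script:)

    # Start one long-lived Julia process that keeps the packages loaded:
    #   julia --startup-file=no -e 'using DaemonMode; serve()'
    using DaemonMode
    serve()

    # Later runs are sent to that daemon instead of paying the startup/load
    # cost again, e.g. from a shell:
    #   julia --startup-file=no -e 'using DaemonMode; runargs()' process_dataset.jl data_001.csv
    # (process_dataset.jl being a hypothetical script that does `using CSV, DataFrames`, etc.)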


It's not 0.85" per file. It's 0.85" for the 50k files combined.


No, because the program has to be run by Slurm across several compute nodes, so it can't process them all in a single run.


As long as you don't have 50k computers, it still should be 0.85 seconds per node, which is still tiny.


> it still should be 0.85 seconds per node

No, it's not, because I will not architect my whole pipeline & program around Julia's inability to start in less than maybe a second a year from now (or 1.7" now); I will just use another language.


You'd be in good company; that is what Python folks do all the time.


Python does not take 1.7" to load pandas.


> I will just use another language.

Hello C, C++ and Fortran.



