
So ~0.85"?

Python is not compiled, and it starts & loads pandas (which is comparable to the libraries loaded in the article) in ~0.4" on my computer, and that's a notoriously slow language.

If I were to use e.g. Rust with polars, load time would be virtually none. And when I have to process ~50k different datasets, I can't afford 0.85" per file, which would translate to roughly 12 hours of overhead (50,000 × 0.85 s ≈ 42,500 s).
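
(For reference, a minimal sketch of how that kind of start-and-load figure can be checked on the Julia side; CSV and DataFrames are just assumed stand-ins for whatever the article actually loads:)

    # load_time.jl -- rough check of package-load time in a fresh Julia session.
    # Run as `time julia --startup-file=no load_time.jl` so the external `time`
    # also captures interpreter startup, not just package loading.
    t0 = time()
    using CSV, DataFrames   # assumed stand-ins for the article's packages
    println("package load: $(round(time() - t0, digits=2)) s")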



> If I were to use e.g. Rust with polars, load time would be virtually none.

Because you're compiling ahead of time...

And if you need to do the same in Julia, you should also pre-compile, or use some other method like https://github.com/dmolina/DaemonMode.jl (their demo shows loading a database, with subsequent loads after the first taking roughly 0.2% of the first's time).
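
(Roughly, the DaemonMode.jl pattern from its README looks like the sketch below; serve() and runargs() are the names the README uses, and process_dataset.jl is a hypothetical script:)

    # Start one long-lived Julia process that keeps the packages loaded:
    #   julia --startup-file=no -e 'using DaemonMode; serve()'
    using DaemonMode
    serve()

    # Later runs are sent to that daemon instead of paying the startup/load
    # cost again, e.g. from a shell:
    #   julia --startup-file=no -e 'using DaemonMode; runargs()' process_dataset.jl data_001.csv
    # (process_dataset.jl being a hypothetical script that does `using CSV, DataFrames`, etc.)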


It's not 0.85" per file. It's 0.85" for the 50k files combined.


No, because the program has to be run by Slurm across several compute nodes, so it can't process them all in a single run.


As long as you don't have 50k computers, it still should be 0.85 seconds per node, which is still tiny.


> it still should be 0.85 seconds per node

No, it's not, because I will not architect my whole pipeline & program around Julia's inability to start in less than maybe a second a year from now (or 1.7" now); I will just use another language.


You'd be in good company; that is what Python folks do all the time.


Python does not take 1.7" to load pandas.


> I will just use another language.

Hello C, C++ and Fortran.



