Hacker News

Being limited by VRAM is a huge constraint for me. Even if it's slower, being able to load 100GB+ into RAM without any batching headaches is worth a lot.

Unless cuDF has implemented some clever dask+cudf kind of setup that can intelligently push data in and out of the GPU as required?



There is dask-cudf, which gets a lot of the way there.

https://docs.rapids.ai/api/dask-cudf/stable/
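
For anyone wondering what that looks like in practice, here's a minimal sketch (the file paths and column names are made up) of pairing dask-cudf with dask-cuda so partitions are streamed through the GPU and spilled to host RAM, instead of having to fit in VRAM all at once:

    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster
    import dask_cudf

    # Spill device memory to host RAM once it passes this threshold, so a
    # dataset much larger than VRAM can still be processed.
    cluster = LocalCUDACluster(device_memory_limit="4GB")
    client = Client(cluster)

    # Read a directory of Parquet files as many GPU-backed partitions; only a
    # handful of partitions are resident on the device at any one time.
    ddf = dask_cudf.read_parquet("data/*.parquet")

    # Build the aggregation lazily, then execute it partition by partition.
    result = ddf.groupby("user_id")["amount"].sum().compute()
    print(result.head())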


Last time I used it, Dask was a lot worse than simple manual batching.


This is huge, this was my only gripe with cudf!


I converted 500GB of data using dask_cudf on a GTX 1060 with 6GB of VRAM, and it was faster than a 20-node m3.xlarge cluster.

What you can do on even consumer GPUs is mind-blowing.
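
Not the parent's exact pipeline, but a rough sketch of that kind of out-of-core conversion (the paths, blocksize value, and column name are placeholders):

    import dask_cudf

    # Read the source CSVs in ~256MB chunks so each partition fits comfortably
    # in a 6GB card; chunks are streamed through the GPU one at a time.
    ddf = dask_cudf.read_csv("raw/*.csv", blocksize="256MB")

    # Example of a cheap per-partition transform done on the GPU.
    ddf["amount"] = ddf["amount"].astype("float32")

    # Write everything back out as Parquet; nothing close to the full dataset
    # is ever resident on the device at once.
    ddf.to_parquet("converted/")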


How does it perform when it comes to plotting such large datasets? Can I use matplotlib?


Correct me if I'm wrong, but with recent GPUs you should just be able to memory-map a file on "disk" into the GPU address space?

Open the file, mmap it, and then import the host pointer through the GPU API (CUDA or Vulkan). Paging should work as expected, but "stupid" access patterns hurt throughput.

This requires a GPU, a CPU, and a kernel driver that support it. For example, my laptop's Intel GPU exposes the Vulkan host-pointer-import API, but last I checked it does not accept pointers imported from mmap.
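
A rough sketch of that route from Python, using a file-backed mapping plus the CUDA runtime through ctypes. Whether cudaHostRegister accepts file-backed pages at all depends on the GPU, driver, and OS, exactly as noted above, so treat this as illustrative rather than portable; the file name is made up:

    import ctypes
    import numpy as np

    cudart = ctypes.CDLL("libcudart.so")  # CUDA runtime; assumes it's on the loader path

    # Map the file into the host address space; the kernel pages it in on demand.
    arr = np.memmap("big_dataset.bin", dtype=np.uint8, mode="r")
    host_ptr = ctypes.c_void_p(arr.ctypes.data)
    nbytes = ctypes.c_size_t(arr.nbytes)

    # Register the file-backed pages as pinned, mapped host memory
    # (cudaHostRegisterMapped == 2). This is exactly the step a given GPU/driver
    # combination may refuse for mmap'd memory; on newer CUDA a read-only
    # mapping may also need the cudaHostRegisterReadOnly flag (0x08).
    CUDA_HOST_REGISTER_MAPPED = 2
    rc = cudart.cudaHostRegister(host_ptr, nbytes, CUDA_HOST_REGISTER_MAPPED)
    assert rc == 0, f"cudaHostRegister failed: {rc}"

    # Get the device-side alias of the same pages. A kernel reading through it
    # faults pages in from disk, so sequential scans are fine while random
    # access patterns will crawl.
    dev_ptr = ctypes.c_void_p()
    rc = cudart.cudaHostGetDevicePointer(ctypes.byref(dev_ptr), host_ptr, 0)
    assert rc == 0, f"cudaHostGetDevicePointer failed: {rc}"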


There's a profiler cell magic for notebooks which helps identify when you run out of VRAM (it reports what ran on the CPU and what ran on the GPU). There's an open PR to turn on low-VRAM reporting as a diagnostic. cuDF is impressive, but getting a working setup can be a PITA (and then if you upgrade libraries...). Personally, I think it fits in the production pipeline for obvious bottlenecks on well-tended configurations; using it in the R&D flow might cost diagnostic time getting and keeping it working (YMMV, etc.)
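
Assuming the magic meant here is the cudf.pandas one (my guess; the comment doesn't name it), it looks roughly like this in a notebook:

    # Cell 1: load the accelerator before importing pandas.
    %load_ext cudf.pandas
    import pandas as pd

    # Cell 2:
    %%cudf.pandas.profile
    # The report printed after this cell lists which operations ran on the GPU
    # and which fell back to the CPU.
    df = pd.read_parquet("events.parquet")  # hypothetical file
    df.groupby("user_id").size()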



