Correct me if I'm wrong, but with recent GPUs you should just be able to memory map a file on "disk" to the GPU address space?
Open, mmap and then import the host pointer into the GPU API (CUDA or Vulkan). Paging should work as expected, but "stupid" access patterns hurt throughput.
This requires a GPU, a CPU and a kernel driver that all support it. E.g. my laptop's Intel GPU exposes the Vulkan host pointer import API, but last I checked it does not accept pointers that come from mmap.
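For anyone who wants to try it, here is a rough sketch of the host side only (open, mmap, get a base address and length), in Python with just the standard library. The helper name `map_file_for_import` is made up for illustration. The actual import step is not shown: I'm assuming it would be something like `cudaHostRegister` on the CUDA side or `VK_EXT_external_memory_host` on the Vulkan side, and whether the driver accepts a file-backed mapping is exactly the part that varies by hardware and driver.

```python
import ctypes
import mmap
import os


def map_file_for_import(path):
    """Hypothetical helper: map a file and return (mapping, address, length).

    The address and length are what a host pointer import API would be
    given. The import call itself is deliberately left out.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        length = os.fstat(fd).st_size
        # ACCESS_COPY gives a private copy-on-write view. It is writable,
        # which ctypes requires before it will expose the base address.
        mapping = mmap.mmap(fd, length, access=mmap.ACCESS_COPY)
    finally:
        # The mapping stays valid after the descriptor is closed.
        os.close(fd)

    view = ctypes.c_char.from_buffer(mapping)
    address = ctypes.addressof(view)
    # Drop the buffer export so mapping.close() can succeed later.
    del view
    return mapping, address, length
```

The returned address is only valid while the mapping is open, so keep the `mapping` object alive for as long as the GPU side might touch the memory, and call `mapping.close()` afterwards.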
There's a profiler cell magic for notebooks which helps identify whether you've run out of VRAM (it reports what runs on the CPU and what runs on the GPU). There's an open PR to turn on low-VRAM reporting as a diagnostic.
cuDF is impressive, but getting a working setup can be a PITA (and then if you upgrade libraries...).
Personally I think it fits in the production pipeline for obvious bottlenecks on well-tended configurations. Using it in the R&D flow might cost diagnostic time getting it working and keeping it working (YMMV etc).
Unless cuDF has implemented some clever Dask+cuDF setup that can intelligently move data in and out of the GPU as required?