HN2new | past | comments | ask | show | jobs | submitlogin

throw the sucker at a video card, and watch it finish thousands of times faster on cheaper hardware

This is nonsense even for dense operations. But this matrix is sparse, in which case GPUs are within a modest factor (2-5 or so depending on the matrix, whether you use multiple cores on the CPU, and whether you ask NVidia or Intel). And if the algorithm does not expose a lot of concurrency (as with most multiplicative algorithms), the GPU can be much slower than a CPU.



The right answer is in the middle. Of course speed up depends on what are you comparing, but if you benchmark GPU against decent four core CPU the speed up is in order of magnitude.

For dense Matrix currently is about eight times (http://forums.nvidia.com/index.php?showtopic=172176) on CUDA 3.1. That is before the CUDA 3.2 which claim 50-300% performance increase.

For sparse matrix the speed up against multi core CPU is about ten times (http://forums.nvidia.com/index.php?showtopic=83825).

All available as a public libraries (http://developer.nvidia.com/object/cuda_3_2_toolkit_rc.html).

jedbrown, please provide a source of your estimates. Of course I'm interested in some highly optimized libraries like BLAS. Hand written code would be on both systems several times slower.


"On the limits of GPU Acceleration" (short summary from Richard Vuduc): http://vuduc.org/pubs/vuduc2010-hotpar-cpu-v-gpu.pdf

"Understanding the design trade-offs among current multicore systems for numerical computations" (somewhat more technical): http://dx.doi.org/10.1109/IPDPS.2009.5161055

"Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU" (from Intel, but using the best published GPU implementations): http://doi.acm.org/10.1145/1815961.1816021

BTW, you may as well cite CUSP (http://code.google.com/p/cusp-library/) for the sparse implementation, it's not part of CUDA despite being developed by Nathan Bell and Michael Garland (NVidia employees).


Really great comment, thank you.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: