x86 assembly is surprisingly good at ray casting/tracing under tight size constraints. I know, it's unexpected! The reason is that the x87 FPU (originally the 8087 FP co-processor) is a stack machine: operands are implicit, so most register-only floating point operations encode in just two bytes. Have a look at the insanely great demo Pyrit (https://www.pouet.net/prod.php?which=78045), which includes documented source code for a simple raytracer in 256 bytes.
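
To make the stack-machine point concrete, here is a rough sketch in Python of how a ray/sphere hit distance evaluates on an FP stack. It is a toy model and not taken from Pyrit's source: `run` and `HIT_DISTANCE` are names I made up, `fmul_self` is a hypothetical stand-in for `fmul st0, st0`, and the inputs `b` (ray origin dotted with unit direction) and `c` (origin dotted with itself minus radius squared) assume a sphere centered at the origin. The thing to notice is that no step names a destination register.

```python
import math

def run(program, inputs):
    """Evaluate (mnemonic, operand) pairs on a toy FP stack; stack[-1] plays st0."""
    stack = []
    for op, arg in program:
        if op == "fld":            # push a named input value
            stack.append(inputs[arg])
        elif op == "fmul_self":    # hypothetical shorthand: st0 = st0 * st0
            stack[-1] *= stack[-1]
        elif op == "fsubp":        # st1 = st1 - st0, then pop
            top = stack.pop()
            stack[-1] -= top
        elif op == "faddp":        # st1 = st1 + st0, then pop
            top = stack.pop()
            stack[-1] += top
        elif op == "fsqrt":        # st0 = sqrt(st0)
            stack[-1] = math.sqrt(stack[-1])
        elif op == "fchs":         # st0 = -st0
            stack[-1] = -stack[-1]
        else:
            raise ValueError(f"unknown op: {op}")
    return stack[-1]

# Nearest hit distance t = -b - sqrt(b*b - c), assuming the ray actually hits.
HIT_DISTANCE = [
    ("fld", "b"),
    ("fmul_self", None),   # b*b
    ("fld", "c"),
    ("fsubp", None),       # b*b - c
    ("fsqrt", None),
    ("fld", "b"),
    ("faddp", None),       # b + sqrt(b*b - c)
    ("fchs", None),        # -b - sqrt(b*b - c)
]

# Ray from (0, 0, -3) along +z against a unit sphere: b = -3, c = 9 - 1 = 8.
t = run(HIT_DISTANCE, {"b": -3.0, "c": 8.0})
```

Eight operations, each of which would be a short instruction on the real FPU, which is roughly why this kind of math fits in so few bytes.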