I wouldn't want to be taught low-level optimizations in school. I sometimes am, and generally they teach you stuff that is 10 to 20 years old.
I think each generation of programmers needs to find many low-level optimization tricks for themselves; the technology changes so much that old tricks are not as useful anymore.
Memory used to be not nearly as slow (compared to processors) as it is today. Many people still count FLOPS, while the number of memory accesses is probably the most relevant metric today, especially on GPUs.
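To put rough numbers on the FLOPS vs. memory point, here is a minimal back-of-the-envelope sketch using a roofline-style bound. The machine figures (1000 GFLOP/s peak, 100 GB/s bandwidth) and the function names are made up for illustration, not measurements of any real hardware.

```python
def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte of memory traffic."""
    return flops / bytes_moved


def attainable_gflops(intensity, peak_gflops, bandwidth_gb_s):
    """Roofline-style bound: capped by compute or by memory bandwidth."""
    return min(peak_gflops, intensity * bandwidth_gb_s)


# Hypothetical machine, numbers chosen for illustration only.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GB_S = 100.0

n = 1_000_000
# y[i] = a * x[i] + y[i] on float32:
# 2 FLOPs per element, 12 bytes moved (read x, read y, write y).
saxpy_ai = arithmetic_intensity(2 * n, 12 * n)
saxpy_bound = attainable_gflops(saxpy_ai, PEAK_GFLOPS, BANDWIDTH_GB_S)

print(f"intensity: {saxpy_ai:.3f} FLOP/byte")
print(f"attainable: {saxpy_bound:.1f} of {PEAK_GFLOPS:.0f} GFLOP/s peak")
```

Under these assumed numbers the loop tops out at under 2% of peak, no matter how well the arithmetic is tuned. Counting FLOPS alone would miss that entirely.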
My former college switched many years ago to teaching low-level concepts on the PLEB, an ARM/Linux computer developed in-house. I would argue they were 10 years ahead of the game, with that platform now being widely commercially deployed.
In fact, the developer of that platform and his professor went on to found a startup that provides low-level software components for millions of Android phones.