That's a very nice write up of that kernel exploit. One question — it says at the end that "subtle bugs like this are very difficult to catch via code review." Shouldn't the checking of eax when the very next function uses rax to index a table have been a red flag? Obviously there is more to the exploit so it still wouldn't be obvious, but shouldn't that have been enough?
It's a red flag now, but prior to the vulnerability being discovered, it was just one of 46348763487653 possible red flags people could have thought of.
In C, this would be a short/long truncation error, and the compiler would catch it --- except in situations like this, where the variable in question was being used in two different functions. So the idea that it could have been easy to catch in assembly is hard to embrace.
My point was only that if you walked through the functions as they are used (as I would expect a code review to do) the eax/rax difference ought to be sufficient to investigate further. Even if you don't see the exploit, you could still fix the length mismatch.
The mechanism for calling into the kernel is not some half-arsed, barely important piece of code; it's one of the few surfaces the kernel should actually need to defend. A line-by-line walk-through is a lot of effort, but surely not too much for a small amount of extremely important and potentially exploitable code.
Am I expecting too much from Linux or from code reviews in general?
Thanks for the clarification. Given the way this cycle is bound to repeat, do you see any mileage in attempts to use typed assembly languages or other static analysis tools on things like the kernel? I've never seen them as worthwhile for application software, but they could definitely prevent this kind of bug, and if current techniques aren't stopping these bugs, then would it be worth the investment?
Yes. I know people won't like me saying this, but the number of local (user->root or user->kernel) exploits on Linux is staggering. And, given Linus et al.'s policy of silently fixing bugs, probably larger than previously thought.
When does the number of exploits reach "staggering"? Is there a single software project on the planet that doesn't "silently" fix bugs? Are you referring to active exploits, or total historic exploits?
> When does the number of exploits reach "staggering"?
Worse than any other (actually used) OS. Really, Windows may suck, but user->root exploits are rare (possibly because most users run with administrator privileges anyway); the same holds for pretty much any *NIX other than Linux.
> Is there a single software project on the planet that doesn't "silently" fix bugs?
We are talking about bugs that are known to have security implications here. And yes, silently fixing those is not the norm.
> Are you referring to active exploits, or total historic exploits?
Say, exploits in the last year. They are pretty fast about fixing them.