I'd say both Spectre 2 and Meltdown are fairly straightforward to fix in hardware without performance degradation (though probably at the cost of a bit more circuitry), so that's most likely what they did.
For Meltdown, you just need to never forward data from the L1$ when the permission check fails. Since address translation already happens at the L1$ lookup, doing the permission check there as well is not a big burden. This is the natural place for it, and it's also why Meltdown specifically affected Intel and not everybody.
For Spectre 2, you need to add some bits to the branch target buffer to tag entries by the privilege level and ideally PCID (to protect user level processes from each other). And then you just don't use BTB entries with the wrong tag.
It'd still be interesting to know the details, e.g. what exactly they changed for Spectre 2.
Your proposal doesn't fix Spectre 2 in general; just the cross-privilege level (or at best cross-process) aspect. Anyone trying to create a sandbox still has to worry about the sandboxed stuff being able to read all of the sandbox memory, for example.
Ideally the ISA would allow user level code to also toggle a flag that affects the BTB. That would allow embedding JIT code while guarding against Spectre 2 (fixing the BTB injection).
I don't think you really need more than that though. Reading arbitrary memory would seem to require a Spectre 1-type attack instead. Unless you have an explicit counterexample in mind?
For the JIT'ed code, Spectre 1 can probably be mostly avoided using the bitmask trick or something equivalent, which we seem to be moving in the direction of anyway (thinking about the WASM execution environment specifically), and for the "host" code inside the sandbox, the same kind of selective software fixes are required as in the kernel.
I suspect you can get arbitrary memory read with Spectre 2 as well; just have to find the right gadgets. Which in a big program of the sort that would have a sandbox inside is somewhat likely...
(I can't comment on specific counterexamples in more detail than that, even if I know of any.)
Spectre 2 is about code in a less trusted domain A introducing an otherwise impossible branch target used by branch prediction on code in a trusted domain B. So a solution that adds tag bits to the BTB to distinguish between those domains should be a 100% effective fix. The only arguments are about how many tag bits & who gets to set them, i.e. a user-mode JIT sandbox would want to be able to control at least one tag bit for this, but if it can, it'd be fully effective.
Unless you start thinking about protecting different JITted code domains in the same address space from each other... I guess at some point you just have to bite the bullet and invalidate the BTB.
I suppose an additional problem would be if the BTB just "hallucinated" random indirect branch target destinations somehow, which could allow untrusted domain A code to jump (in speculation) to code that would normally be unreachable. A kind of "reverse Spectre 2", if you will, although the same tag approach should fix that at the same time as the original Spectre 2.
Spectre 1 attacks from JIT'ed code against the outside code are definitely possible though.
> The only arguments are about how many tag bits & who gets to set them
That's fair. I guess I've seen no indication that bits will be provided by ISAs and OSes to userland (past the bit that distinguishes userland from kernel code)...
> Unless you start thinking about protecting different
> JITted code domains in the same address space from each
> other
That's also fair. As a browser developer, this is my world right now. ;)
It should (depending on how they implement it), as long as the software is re-modified to run the original way on only this platform and later. The performance degradation at least would move into hardware instead of software.
How are they going to fix it without some performance penalty? If the memory is in the cache, isn't there going to be a timing difference? Or will an extra check have to be made, which will slow down the lookup?
From what I understand, this fixes variants two and three, which are cross-ring and cross-process data extraction. I believe their hardware mitigation is to separate the caches between those (or something to that effect). So there will still be a minor performance hit when running multiple identical processes compared to before.
That's my interpretation though, I don't know enough to say that's the correct interpretation.
Does the hardware on x86 even know which cache is owned by which process? It knows which is elevated (kernel Vs. User Vs. Hypervisor, etc) but process isolation on x86 is largely a kernel API construct rather than a hardware one.
Simplistically, it allows TLB-cached page table contents to be tagged with a context identifier, and limits lookups in the TLB to only match within the currently allowed context. TLB-cached entries with a different PCID will be ignored.
I think you are confusing the TLB and processor data caches.
PCIDs exist for virtual addresses and are implemented in the TLB, not the processor caches. The processor caches are physically indexed, not virtually indexed.
So, does the processor cache have a mechanism to tell which "process" to which a given line belongs? Nope. Two processes (or a process and the kernel) that share memory can be in two PCIDs, but share the processor caches.
You are correct. My assumption was that the original question is not focused on the data cache itself, and so the DTLB does qualify for "knowing" what belongs to which context.
We can't really know, since we don't have something to compare it to. Only they know what the performance could have been if the fix were not implemented in silicon, and they won't be releasing new versions of old chips that ONLY have this hardware fix, so we will never be able to compare the two. Only they will ever know, however if we see comparable or worse performance (or just barely better) in these new chips then we can know that there was some performance hit.
>"As we bring these new products to market, ensuring that they deliver the performance improvements people expect from us is critical. Our goal is to offer not only the best performance, but also the best secure performance."
https://newsroom.intel.com/editorials/advancing-security-sil...
It's doubtful Intel would delay their plans much to work on this, and it's even more doubtful the result would be better for anybody: we would have to use the old, completely unpatched chips for longer...
It's better for everybody to have more frequent, gradual fixes, even if the first ones are not 100% complete.
Would this result in 32-bit, x64, and x64-post-2018-CPU versions of software? From my limited understanding, apps (such as the V8 engine) are being compiled with workarounds that won't be needed once the flaws are fixed in hardware.
I'm glad they're going to be releasing microcode fixes for older processors, mainly because my hardware still uses one:
> Finally, Intel will also be going even further back with their microcode updates. Their latest schedule calls for processors as old as the Core 2 lineup to get updates, including the 1st gen Core processors (Nehalem/Gulftown/Westmere/Lynnfield/Clarksfield/Bloomfield/Arrandale/Clarkdale), and the 45nm Core 2 processors (Penryn/Yorkfield/Wolfdale/Harpertown), which would cover most Intel processors going back to late 2007 or so.
I'm still waiting to hear if there's ever going to be a hardware mitigation for Spectre variant 1. It's not accurate to call things "fixed" without it. Is anyone even working on it?
Spectre variant 1 is arguably a software problem. It's not about consuming "bad" data that was placed by an adversary like variant 2, or allowing rogue data cache loads like with Meltdown. It happens as the result of normal execution.
The processor manuals are also very clear that speculation can be wrong and can have observable side-effects[1][2]. The processors also have ways to constrain speculation architecturally (like lfence) in the case that speculation can do something that results in exposure to a vulnerability. Those two things put together make it a software problem _now_, and perhaps forever.
This means that those of us who write security-sensitive software have a long road ahead of us to learn how this affects us and to continue to enhance our defenses. Adding array_index_nospec() calls in Linux, for instance.
1. Intel SDM Vol 3a, 4.10.2.3 Details of TLB Use: "The processor may cache translations required ... for accesses that are a result of speculative execution that would never actually occur in the executed code path."
2. CLFLUSH instruction definition says: "processors are free to speculatively fetch and cache data from system memory regions assigned a memory-type allowing for speculative reads"
This is exactly my fear, that Intel and others will continue to go with the "working as designed" defense. The problem still exists. The design is bad, and users will pay the price with slower software, or less secure software, until the design is fixed.
Disclaimer: I don't work at Intel, nor in any other processor design house, nor in a fab, etc.
And honestly, look at what variant 1 actually is before spreading uninformed opinions. It is fundamental to speculative execution. You can have some form of hardware support, and Intel, AMD, ARM, and probably others have actually begun work on the subject, but it takes the form of instructions that can be added or substituted in the software at the right spots, because there is absolutely no way for sufficiently speculative hardware to enforce security against variant 1 without some software help: it is not an architectural or a microarchitectural problem; it is a fundamental logic problem...
My opinion is that we need extensible tagging in programming languages quickly, for that and other subjects.
I understand variant 1, and I do not share your pessimism. It is not fundamental to speculative execution as a concept, only current implementations.
Speculative execution in modern processors already requires performing heroic feats to enforce correctness in the face of mispredicted branches. Fixing Spectre variant 1 requires more heroic feats of a similar kind (rolling back state changes), and I don't think it is fundamentally impossible given what has already been done.
It won't be easy, and it won't be free, but it needs doing. Nobody thought it was worth doing until now, so nobody tried, but that doesn't indicate that it's impossible.
Spectre variant 1 is only really a problem for JITs, and it's mitigated with masking. Disabling speculation to defeat variant 1 is probably not worth the performance cost.
For those of us who keep computers for many years, these flaws certainly put a dampener on new sales. No way I'd buy a brand new computer today with a known flawed/suboptimal CPU, and be lumbered with that for the next 5-10 years.
> No way I'd buy a brand new computer today with a known flawed/suboptimal CPU
I think at some point you have to give up on the notion that most things will (or even can) ever be perfectly optimal. Whether it's performance, security, reliability, or anything else, everything is flawed in one way or another... it's just a question of how relevant the flaws are for your use cases. In this case I'm not too worried until I hear about an actual Spectre-based attack carried out against a random user in the wild.
All I'm saying is that for a certain set of potential customers, this tips the balance from spending money to not spending money. A handful of people inside Intel and computer OEMs are probably spitting feathers about this circumstance. At the present time, I feel no pressure to accept anything.
I'm not exactly sure what's more "nebulous" about the timing for consumer chips. This timing is specified for both the consumer and server chips in the same sentence in the press release: "These changes will begin with our next-generation Intel® Xeon® Scalable processors (code-named Cascade Lake) as well as 8th Generation Intel® Core™ processors expected to ship in the second half of 2018." (I guess the nebulous thing is exactly which 8th generation Cores will have the update.)
> I guess the nebulous thing is exactly which 8th generation Cores will have the update.
Indeed. Plenty of 8th gen processors have already shipped so it won't be easy to tell which ones are fixed. Also, 8th gen includes Kaby Lake-R, Coffee Lake, Cannonlake, and maybe Whiskey Lake; I doubt all of those will be fixed. And I was expecting Intel to be selling 9th gen in 2H2018.
I am reading about microcode updates, but I'm wondering if Dell doesn't update the BIOS for my (early Core i3, admittedly pretty old) computer am I SOL regardless of what Intel does?
The microcode updates can also be loaded by your OS if they are not installed via a firmware update; the kernel applies them early in the boot process.
See [0] for Linux and possibly [1] for Windows. Not 100% sure whether Windows handles this via Windows Update or not.
One of the most dangerous practical manifestations of this is in datacenters, though, where it can be used to go poking around in other people's VMs.
On the consumer desktop/laptop, there are many other vectors that are a lot easier to use already, not least of which is plain old social engineering. Right now, an in-the-wild virus that tried to use these vulnerabilities would strike me as somebody showboating more than a serious attempt to exploit systems with these methods.
Naturally it goes the other way round: consumer (high end) chips receive new designs first and datacenters get them once they're somewhat battletested.
Yes. Most reports placed time-to-available-hardware in the low years (a whole dev cycle). Perhaps those were early sensationalist reports (this article would suggest they were!), but it is a surprise if you only read them.
Most of the long-horizon reports that I read centered around Spectre 1 (intra-process OOB reads by racing bounds check against conditional branch prediction, currently mitigated by lfence), which is still a fundamental-principles issue that would be exceptionally challenging to solve without either substantial performance loss or some form of userland/compilation-level workaround. Intel still haven't indicated a hardware fix for that one. I expect that most mitigations for this will center around improving static analysis tooling and giving the compiler finer-grained control around fencing to reduce performance impact.
Variant 2 (branch target injection, with retpoline-style software mitigation) is the most interesting. There are obvious theoretical hardware mitigation strategies, like tagged BTB entries. To me it's very cool, but not particularly surprising, that Intel believe they'll have pure-hardware mitigations for that soon.
And Meltdown was frankly an Intel mistake that would have been shocking if left unfixed in the next hardware revision.