
This would be bad. Not because of the CPU business - I think RISC-V will eventually make that irrelevant. Once CPUs are open source commodities, the next big thing is GPUs. This merger will eliminate a GPU maker, and one that licenses the IP at that.


I think you are confused about what RISC-V is and how these things work.

RISC-V is just an ISA.

You will have open source RISC-V microarchitectures for processes that are a few generations old. You can use the same design for a long time when performance is not so important.

You will not get open source, optimized, high-performance microarchitectures for the latest process and large volumes. These cost hundreds of millions of dollars to design, and the work is repeated every few years for a new process. Every design is closely optimized for the latest fab technology, and they are covered by patents.

Intel, AMD, Nvidia, and ARM all have to design new microarchitectures every few years. It's not just doing some VHDL design. It involves research, pathfinding, large-scale design, simulation, verification, and working closely with the fab business. The software alone costs millions, and part of it is redesigned each cycle. Timelines are tight, or the design becomes outdated over time.


"Building" for the latest process and large volumes is another story, but as far as I can see, large-scale logic design is something not _that_ far away from software. Large-scale, open source, performant software designs exist in the wild (see Linux, LLVM, ...).

Why wouldn't we get a logic netlist which could perform reasonably well when placed on silicon by people who know what they are doing? (Yeah, lots of handwave.) I'm asking this out of curiosity. Not an expert in the field by any means.


SonicBOOM is an open source RISC-V core that is clock-for-clock competitive with the best from ARM. There is still the issue of matching clock speed and tweaking for a given process.

Foundries actually have an interest in helping to optimize an open core for their process as a selling point since it can be reused by multiple customers.


Their own paper[0] shows it performing at similar levels to the A72, a four-year-old core. Those are obviously really impressive results for an open source core.

The best ARM core is Apple's Lightning, which has an IPC rate about 4X that of the A72.

[0]: https://carrv.github.io/2020/papers/CARRV2020_paper_15_Zhao....


Do you have a reference for Apple's IPC?


I was using AnandTech’s SPEC2006 benchmarks.


Agreed, this reminds me of AMD's acquisition of ATI back in 2006. It almost killed AMD, and now Nvidia is thinking about doing the same? Technically the reverse, since Nvidia is the GPU company acquiring a CPU company.

At the same time, Nvidia may be trying to hedge its future as its other competitors (Intel, AMD, even Apple) all have their own CPU and GPU designs. The animosity with Apple has shown the power dynamics and the high stakes.


I believe grandparent is referring to ARM's Mali GPU.


Apple's CPUs are still ARM64 cores at heart, even if heavily modified.

A thought I've been kicking around, though I fully understand it to be incredibly unlikely: NVIDIA could simply terminate any license Apple has to use ARM at all. The move would arguably be done out of pure spite, "payback" for the GeForce 8600M issues that cost them roughly $200 million and $3 billion in market cap, in part due to Apple pushing for a recall.

Apple also seemingly pushed Nvidia utterly out the door, going as far as to completely block NVIDIA from providing drivers that would let users run their products even in "eGPU" external enclosures on newer versions of macOS. Even if only a minority of Apple users ever bought Nvidia cards, being completely banned from an OEM's entire lineup would likely ruffle feathers.


Let’s ignore that founding ARM investor Apple has a perpetual license and their own CPU designs.

Do you really think NVidia is going to substantially overpay for ARM and then nuke its remaining value by going to war on its licensees?


Apple has an "ARM Architecture License". They have a custom implementation that's very different from the reference design (which is why they're miles ahead of other ARM CPU makers). I'm sure the license has contractual obligations that protects Apple. In short, Apple is in no danger, most likely.


You're swallowing the poison pill with pride. The reason why no ARM customer actually wants to buy ARM is because there is almost no way to make money off of it. If you cancel licenses ARM is going to implode and $32 billion worth of value locked up in ARM will vanish into thin air. If you want to extract value out of ARM you'll have to play nice and that is definitively not worth $32 billion either.


Aren't you mixing the ISA with the implementation here? Or do Apple's CPUs really contain Arm's logic, contrary to public info, e.g. on Wikipedia?


You have to license the ARM ISA as well, you can't just freely implement it. IANAL, but as far as US law is concerned, I don't think you're violating copyright by cloning an ISA, but if that ISA is covered by any patents, you'd be fairly screwed without licensing. ARM definitely considers there to be enforceable patents on their ISA (just follow their aggressive assault on any open source HDL projects that are ARM compatible).

If Nvidia bought ARM and decided to find some legal way to terminate the contract with Apple out of spite, Apple would have to find another ISA for their "Apple Silicon".


It's not just patents.

ARM has previously shut down open source software emulation of parts of the ARM ISA.

For a while QEMU could not implement ARMv7, I think, until ARM changed their mind and started permitting it. There was an open source processor design on opencores.org that got pulled too.

The reasoning was something like "to implement these instructions you must have read the ARM documentation, which is only available under a license which prohibits certain things in exchange for reading it".


If Apple are worried about that prospect, couldn't they easily outbid Nvidia to buy ARM?


Theoretically Apple could probably outbid Nvidia, but realistically regulators would never let Apple make that purchase so it's irrelevant that they have the cash to do so.


I'm not particularly familiar with US antitrust law specifics...

One big point that came up in the congressional hearing the other day was how Google, when buying DoubleClick, said (under oath to congress) that not only would they not merge data but that they legally couldn't if they wanted to - and then years later did just that.

Is there any way to acquire in such a way that Apple would own ARM but there'd be a complete firewall between them, with ARM having a separate board, CEO, etc. and nothing between them except the technical ownership (and any contracts between the two companies)?

I hope not, as I'm in favour of breaking up huge companies myself... but if legal firms can have systems in place to firewall information between clients, I don't see why a similar legal arrangement couldn't be feasible, allowing Apple to buy on the condition that they have no say over operations, with selling ARM being their only way of influencing it in any way.

Of course, owning a company where you have no control at all isn't great, but in this case it might be worth it if Apple trusts ARM to keep doing well without Apple's help, and if it would prevent someone like NVIDIA from shutting Apple out.

And would there be any antitrust issues if Apple bought a 500 year license to freely (or at a pre-set calculation of pricing) use any and all current and future ARM designs?


Bloomberg says that SoftBank approached Apple before they approached Nvidia, but who knows.


I should have read the article before commenting! But I guess that means Apple aren't worried, either because their contracts are solid enough and/or they already have backup plans ready that are cheaper than buying ARM.

(Or, of course, Apple are making a huge mistake. But seems a bit less likely to me.)


Or just buy Nvidia


It would be like a gas station refusing to sell gas. Utter suicide for ARM.


To my knowledge, "Apple Silicon" or whatever you want to call it is an ARM64 ISA, with Apple extensions.

Third parties have been able to get iOS running in an emulator for security research, and Android(!) has even been ported to the iOS devices vulnerable to the boot ROM exploit used by "checkra1n" (though with things like GPU, Wi-Fi, etc. in varying states of completion).


The ISA is just the instruction set that the silicon is designed to interpret. ISA compatibility does not imply any shared heritage in the silicon between Apple's and Arm Ltd's cores, just like AMD and Intel share no silicon design even though they both sell processors implementing the AMD64 ISA.

(The silicon implementation of an ISA is referred to as the microarchitecture, btw.)
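To make the ISA-vs-microarchitecture split concrete, here's a toy sketch in Python (the one-instruction "ISA" and both "cores" are made up for illustration): two completely different implementations of the same contract, which is all ISA compatibility requires.

```python
# Toy sketch (hypothetical): one ISA, two independent implementations.
# The ISA only fixes what "add" means; how each core computes it
# internally is entirely up to the implementer.

def simple_core(op, a, b):
    """A naive, direct implementation of the 'add' instruction."""
    if op == "add":
        return (a + b) & 0xFFFFFFFF  # 32-bit wraparound

def fancy_core(op, a, b):
    """A completely different implementation of the same contract,
    built from bitwise ops instead of '+': same result, different guts."""
    if op == "add":
        while b:
            carry = (a & b) << 1
            a, b = (a ^ b) & 0xFFFFFFFF, carry & 0xFFFFFFFF
        return a

# Both conform to the ISA: observable behaviour is identical.
assert simple_core("add", 0xFFFFFFFF, 1) == fancy_core("add", 0xFFFFFFFF, 1) == 0
```

The same relationship holds between Apple's cores and Arm's reference cores: shared contract, unshared silicon.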


Right, but the ARM ISA is covered by various patents, so you have to get a license for it. ARM has aggressively shut down any open efforts to make ARM compatible cores (without licenses).


Why wouldn't something similar to RISC V happen in the GPU space?


In a GPU the ISA isn't decoupled from the architecture in the way it is for a post-Pentium Pro CPU. Having a fixed ISA that you couldn't change later when you wanted to make architectural changes would be something of a millstone to be carrying around for a GPU.


I’m curious, why is this the case for GPUs and not CPUs?


It's much more advantageous to be able to respin/redesign parts of the GPU for a new architecture since the user interface is at a much much higher level compared to a CPU. They basically only have to certify that it'll be API compatible at CUDA/OpenCL/Vulkan/OpenGL/DirectX level and no more. All of those APIs specify that the drivers are responsible for turning it into the hardware language, so every program is already re-compiled for any new hardware. This does lead to tiny rendering differences in the end (it shouldn't but it frequently does, due to bug fixes and rounding changes).

So because they aren't required to keep that architectural similarity anymore, they're free to change as they need new features or come up with better designs (frequently to allow more SIMD/MIMD style stuff, and greater memory bandwidth utilization). I doubt they really change all that much between two generations, but they change enough that exact compatibility isn't really worth working at.
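The "every program is re-compiled for any new hardware" point can be sketched as a toy (the mini shader language and both backends below are invented for illustration, not any real driver): the application ships one portable source, and each vendor's driver lowers it to whatever its hardware likes, so the ISA can change freely underneath.

```python
# Toy sketch (hypothetical mini-language, not a real driver API): the
# application ships one portable shader source; each vendor's "driver"
# lowers it to its own device-specific encoding at load time.

PORTABLE_SHADER = "mul r0, in0, in1; add out0, r0, in2"

def compile_for(device, source):
    """Each backend is free to pick a totally different encoding."""
    ops = [op.strip() for op in source.split(";") if op.strip()]
    if device == "vendor_a":
        # Vendor A keeps separate mul and add instructions.
        return [f"A_{op.split()[0].upper()}" for op in ops]
    else:
        # Vendor B fuses the mul+add into a single FMA, which is legal
        # because only API-level behaviour is specified, not the ISA.
        return ["B_FMA"]

print(compile_for("vendor_a", PORTABLE_SHADER))  # ['A_MUL', 'A_ADD']
print(compile_for("vendor_b", PORTABLE_SHADER))  # ['B_FMA']
```

This is also roughly why a fixed public ISA would be a millstone: it would pin down exactly the layer the vendors currently get to redesign every generation.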

If you want to look at some historical examples where this wasn't quite the case, look at the old 3dfx Voodoo series. They did add features, but they kept compatibility to the point where even a Voodoo 5 would work with software that only supported the Voodoo 1. (n.b. This is based on my memory of the era; I could be wrong.) They had other business problems, but it meant that adding completely new features and changes in Glide (their API) was more difficult.


RISC-V is an ISA rather than silicon, GPUs are generally black boxes that you throw code at. There's not much to standardize around


AMD document their ISA: https://llvm.org/docs/AMDGPUUsage.html#additional-documentat...

That's why third party open shader compilers like ACO could be made:

https://gitlab.freedesktop.org/mesa/mesa/-/tree/master/src/a...



AMD document their ISAs, but each one maps pretty much one-to-one onto a particular implementation. Compatibility and standardization are not goals.


As long as they make an open source compiler, there is at least a reference implementation to compare to.


GPUs do have ISAs. It's just that they're typically hidden behind drivers that provide a more standardized API.


Of course they have ISAs, my point is that the economics of standardization around a single ISA a la RISC-V isn't as good by virtue of the way we use GPUs on today's computers. You could make GPU-V but why would a manufacturer use it


> GPUs are generally black boxes that you throw code at.

umm... what? what does that even mean? lol

I could kind of maybe begin to understand your argument from the graphics side, as users mostly interact with it at an API level; however, keep in mind that shader languages work the same way "CPU languages" do. It's all still compiled to assembly, and there's no reason you couldn't make an open instruction set for a GPU the same as for a CPU. This is especially obvious when it comes to compute workloads, as you're probably just writing "regular code".

Now, that said, would it be a good idea? I don't really see the benefit. A barebones GPU ISA would be too stripped back to do anything at all, and one with the specific accelerations needed to be useful will always want to be kept under wraps.


Just 'cause Nvidia might want to keep architectural access under wraps doesn't necessarily mean that everyone else is going to, or that they have to in order to maintain a competitive advantage. CPU architectures are public knowledge, because people need to write compilers for them, and there are still all sorts of other barriers to entry and patent protections that would allow maintaining competitive advantage through new architectural innovations. This smells less of a competitive risk and more of a cultural problem.

I'm reminded of the argument over low-level graphics APIs almost a decade ago. AMD had worked together with DICE to write a new API for their graphics cards called Mantle, while Nvidia was pushing "AZDO" techniques about how to get the best performance out of existing OpenGL 4. Low-level APIs were supposed to be too complicated for graphics programmers for too little benefit. Nvidia's idea was that we just needed to get developers onto the OpenGL happy path and then all the CPU overhead of the API would melt away.

Of course, AMD's idea won, and pretty much every modern graphics API (DX12, Metal, WebGPU) provides low-level abstractions similar to how the hardware actually works. Hell, SPIR-V is already halfway to being a GPU ISA. The reason why OpenGL became such a high-overhead API was specifically because of this idea of "oh no, we can't tell you how the magic works". Actually getting all the performance out of the hardware became harder and harder because you were programming for a device model that was obsolete 10 years ago. Hell, things like explicit multi-GPU were just flat-out impossible. "Here's the tools to be high performance on our hardware" will always beat out "stay on our magic compiler's happy path" any day of the week.


You could make a standardized GPU instruction set but why would anyone use it? We don't currently access GPUs at that level, like we do with the CPU.

It's technically possible but the economics isn't there (was my point). The cost of making a new GPU generally includes writing drivers and shader compilers anyway, so there's not much of a motivation to bother complying with a standard. It would be different if we did expose them at a lower level (i.e. if CPU were programmed with a jitted bytecode then we wouldn't see as much focus on ISA as long as the higher level semantics were preserved)


SPIR-V looks like a promising standardization. It cannot be translated directly into silicon, but it doesn't have to be. Intel also essentially emulates x86 and runs RISC internally.


>Intel also essentially emulates x86 and runs RISC internally.

By that logic anything emulates its ISA because that is the definition of an ISA. An ISA is just the public interface of a processor. You are wrong about what x86 processors run internally. Several micro ops can be fused into a single complex one which is something that cannot be described with a term from the 60s. Come on, let the RISC corpse rot in peace. It's long overdue.
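For what it's worth, the fusion being described can be sketched with a toy decoder (purely illustrative, not a real x86 front end): a compare followed by a conditional branch is issued as one internal macro-op, which is the kind of thing the old RISC/CISC vocabulary doesn't capture.

```python
def decode(instructions):
    """Toy front end, loosely in the spirit of x86 macro-op fusion:
    a cmp followed by a conditional jump is issued as one internal op;
    everything else passes through unchanged."""
    fused, i = [], 0
    while i < len(instructions):
        cur = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if cur.startswith("cmp") and nxt and nxt.startswith("j"):
            fused.append(f"{cur} + {nxt} (fused)")
            i += 2  # consumed two architectural instructions
        else:
            fused.append(cur)
            i += 1
    return fused

print(decode(["add eax, 1", "cmp eax, 10", "jne loop"]))
# ['add eax, 1', 'cmp eax, 10 + jne loop (fused)']
```

The point: what the software sees (three instructions) and what the machine executes (two internal ops) are decoupled at the front end.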


GPUs are also much simpler chips in comparison to CPUs.

90%+ of the core logic area (stuff that is not i/o, power, memory, or clock distribution) on the GPU are very basic matrix multipliers.

They are in essence linear algebra accelerators. Not much space for sophistication there.

All best possible arithmetic circuits, multipliers, dividers, etc. are public knowledge.


I've been studying and blogging about GPU compute for a while, and can confidently assert that GPUs are in fact astonishingly complicated. As evidence, I cite Volume 7 of the Intel Kaby Lake GPU programmers manual:

https://01.org/sites/default/files/documentation/intel-gfx-p...

That's almost 1000 pages, and one of 16 volumes, it just happens to be the one most relevant for programmers. If this is your idea of "simple," I'd really like to see your idea of a complex chip.


The most complex circuit on the GPU would be the thing that chops the incoming command stream and turns it into something the matrix multipliers can work on.


I get the feeling you're only really thinking about machine learning style workloads. Your statement doesn't seem to take into account scatter/gather logic for memory traffic (including combine logic for uniforms), resolution of bank conflicts, sorting logic for making blend operations have in-order semantics, the fine rasterizer (which is called the "crown jewels of the hardware graphics pipeline" in an Nvidia paper), etc. More to the point, these are all things that CPUs don't have to deal with.

Conversely, there is a lot of logic on a modern CPU to extract parallelism from a single thread, stuff like register renaming, scoreboards for out of order execution, and highly sophisticated branch prediction units. I get the feeling this is the main stuff you're talking about. But this source of complexity does not dramatically outweigh the GPU-specific complexity I cited above.


Isn't the rasteriser "simply" a piece of code running on the GPU?


No, there is hardware for it, and it makes a big difference. Ballpark 2x, but it can be more or less depending on the details of the workload (ie shader complexity).

One way to get an empirical handle on this question is to write a rasterization pipeline entirely in software and run it in GPU compute. The classic Laine and Karras paper does exactly that:

https://research.nvidia.com/publication/high-performance-sof...

An intriguing thought experiment is to imagine a stripped-down, highly simplified GPU that is much more of a highly parallel CPU than a traditional graphics architecture. This is, to some extent, what Tim Sweeney was talking about (11 years ago now!) in his provocative talk "The end of the GPU roadmap". My personal sense is that such a thing would indeed be possible but would be a performance regression on the order of 2x, which would not fly in today's competitive world. But if one were trying to spin up a GPU effort from scratch (say, motivated by national independence more than cost/performance competitiveness), it would be an interesting place to start.
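To give a feel for what a software rasterizer in the Laine and Karras sense actually does, here's a minimal sketch in Python using edge functions, the same coverage test a hardware fine rasterizer evaluates for many pixels in parallel (illustrative only: fill rules, clipping, and perspective are all omitted).

```python
# Minimal software rasterizer sketch: determine which pixels of a small
# grid are covered by a triangle, via three edge-function sign tests.

def edge(ax, ay, bx, by, px, py):
    # Signed area of (a, b, p): positive if p is to the left of edge a->b.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def rasterize(tri, width, height):
    """Return the (x, y) pixels whose centres fall inside the
    counter-clockwise triangle `tri`."""
    (ax, ay), (bx, by), (cx, cy) = tri
    covered = []
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5      # sample at the pixel centre
            w0 = edge(bx, by, cx, cy, px, py)
            w1 = edge(cx, cy, ax, ay, px, py)
            w2 = edge(ax, ay, bx, by, px, py)
            if w0 >= 0 and w1 >= 0 and w2 >= 0:  # inside all three edges
                covered.append((x, y))
    return covered

pixels = rasterize([(0, 0), (8, 0), (0, 8)], 8, 8)
```

Dedicated hardware wins by evaluating these tests for whole tiles of pixels at once with fixed-function logic, which is roughly where the ballpark-2x figure above comes from.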


Intel will ship Larrabee any day now... :)


The host interface has to be one of the simplest parts of the system, and I mean no disrespect to the fine engineers who work on that. Even the various internal task schedulers look more complex to me.

If you don't have insider's knowledge of how these things are made, I suggest using less certain language.


> The most complex circuit on the GPU would be the thing chops the incoming command stream

That's not true at all. I'd recommend reading up on current architectures, or avoid making such wild assumptions.


I completely disagree with this comment.

Just because a big part of the chip is the shading units doesn't mean it's simple or that there's no space for sophistication. Have you even been following the recent advancements in GPUs?

There is a lot of space for absolutely everything to improve. Especially now that Ray Tracing is a possibility and it uses the GPU in a very different way compared to old rasterization. Expect to see a whole lot of new instructions in the next years.


> 90%+ of the core logic area (stuff that is not i/o, power, memory, or clock distribution) on the GPU are very basic matrix multipliers.

> All best possible arithmetic circuits, multipliers, dividers, etc. are public knowledge.

Combine these 2 statements and most GPUs would have roughly identical performance characteristics (performance/Watt, performance/mm2, etc)

And yet, you see that both AMD and Nvidia GPUs (but especially the latter) have seen massive changes in architecture and performance.

As for the 90% number itself: look at any modern GPU die shot and you'll see that 40% is dedicated just to moving data in and out of the chip. Memory controllers, L2 caches, raster functions, geometry handling, crossbars, ...

And within the remaining 60%, there are large amounts of caches, texture units, instruction decoders etc.

The pure math portions, the ALUs, are but a small part of the whole thing.

I don't know enough about the very low level details of CPUs and GPUs to judge which ones are more complex, but on the claim that there's no space for sophistication, I can at least confidently say that I know much more than you.


> GPUs are also much simpler chips in comparison to CPUs

Funny you say that. I've never heard a CPU architect coming to the GPU world and say "Gosh, how simple is this!".

I invite you to look at a GPU ISA and see for yourself, and that is only the visible programming interface.


Judging by your nickname, I think I have reasons to listen. You are that David who writes GPU drivers?

So, what do you think is the most complex thing on an Nvidia GPU?


Matrix multipliers? As in those tensor cores that are only used by convolutional neural networks? Aren't you forgetting something? Like the entire rest of the GPU? You're looking at this from an extremely narrow machine learning focused point of view.


Is that last sentence provable? If so, that's an impressively-strong statement (to state that the provably-most-efficient mathematical-computation circuit designs are known).


Well, at least for reasonably big integers, it is.

There might be faster algos for super long integers, or minute implementation differences that add/subtract a few kilogates.


> Well, at least for reasonably big integers, it is.

Big integer calculations are the bread and butter of GPUs now?


RISC-V is in danger of imploding from its near-infinite flexibility.

It is driven largely by academics who lack a pragmatic drive in areas of time-to-market, and it is being explored by companies for profit motives only. NXP, NVIDIA, Western Digital see it as a way to cut costs due to Arm license fees.

RISC-V smells like Transmeta. I lived through that hype machine.


Transmeta as in a company that was strong armed by Intel in a court case Transmeta could have won if they hadn't run out of cash and time (and they ran out of those things because of that court case)?


Transmeta was never a viable solution: it was a pet project billed as a disrupter on the basis of it being a dynamic architecture. Do I need to explain the mountain of issues with their primary objectives or do you want to google that?


You make a HUGE assumption that this is really going to happen.


Well sure, but... aren't the consequences of the possibility that it's not going to happen literally not worth discussing at all?


(Re: RISC-V) I hope so!!!


I hear a lot of people talking about RISC-V but I wonder why people don't think companies would move to other open ISAs like Power or MIPS.


"MIPS Open" came after the RISC-V announcement, and is still currently somewhat of a joke. Half the links on the MIPS Open site are dead.

I think one of the major points for RISC-V was to avoid the possibility of patent encumbrance of the ISA so that it can be freely used for educational purposes. My computer architecture courses 5-6 years ago used MIPS I heavily. MIPS was not open at the time, but any patents for the MIPS-I ISA had long since expired.

POWER is actually open, but it is tremendously more complicated. RISC-V by comparison feels like it borrows heavily from the early MIPS ISAs, just with a relaxation of the fixed-size instructions, no architectural delay slots, and a commitment to an extensible ISA (MIPS had the coprocessor interface, but I digress).

The following is my own experience - while obviously high performance CPU cores are the product of intelligent multi-person teams and many resources, I believe RISC-V is simple enough that a college student or two could implement a compliant RV32I core in an HDL course if they knew anything about computer architecture. It wouldn't be a peak performance design by any measure (if it was they should be hired by a CPU company), but I think that's actually a point of RISC-V as an educational platform AND a platform for production CPU cores.
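To make the "student core" claim concrete, here's a sketch of roughly the smallest interesting piece: decoding and executing one RV32I ADDI instruction. It's a software model in Python rather than HDL, but the field layout is the real I-type format from the RISC-V spec.

```python
def exec_addi(inst, regs):
    """Decode and execute one RV32I ADDI instruction (I-type format).
    Field layout per the spec: imm[31:20] rs1[19:15] funct3[14:12]
    rd[11:7] opcode[6:0]."""
    opcode = inst & 0x7F
    rd     = (inst >> 7) & 0x1F
    funct3 = (inst >> 12) & 0x7
    rs1    = (inst >> 15) & 0x1F
    imm    = inst >> 20
    if imm & 0x800:            # sign-extend the 12-bit immediate
        imm -= 0x1000
    assert opcode == 0b0010011 and funct3 == 0b000, "not an ADDI"
    if rd != 0:                # x0 is hardwired to zero
        regs[rd] = (regs[rs1] + imm) & 0xFFFFFFFF
    return regs

regs = [0] * 32
exec_addi(0x00500093, regs)    # addi x1, x0, 5
exec_addi(0xFFF08113, regs)    # addi x2, x1, -1
print(regs[1], regs[2])        # 5 4
```

The whole RV32I base is about 40 such instructions with only a handful of encoding formats, which is exactly why it fits in a semester project.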


As a teaching tool RISC-V is clearly great, as it is for companies that want to add custom instructions to their microcontrollers like NVidia or WD. But if I was looking to design a core to run user applications then to me it looks like everything is stacked in favor of Power. The complexity of the ISA is dwarfed by the complexity of a performant superscalar architecture. And to be performant in RISC-V you'd probably be needing extensive instruction fusion and variable length instructions anyways further equalizing things. And you really need the B extension which hasn't been standardized yet. Plus binary compatibility is a big concern on application cores and ISA extensions get in the way of that.


I totally agree.


> still currently somewhat of a joke. Half the links on the MIPS Open site are dead.

unsurprising given that Wave Computing shut down the project almost a year ago and subsequently declared bankruptcy.


Oh I missed that, thanks for the info.


The MIPS Open ISA project was actually shut down after only a year. [^1]

POWER has technically only been "open" for a little under a year. OpenPOWER was always a thing, but this used to mean that companies could participate in the ISA development process and then pay license fees to build a chip. This changed last year when POWER went royalty-free (like RISC-V).

The real definition of "open" is: can you answer the following questions in the negative?

  - Do I need to pay someone for a license to implement the ISA?
  - Is the ISA patent-encumbered?
RISC-V was the only game in town for a long time and thereby attracted large companies (including NVIDIA) and startups that were interested in building their own microprocessors, but didn't want to pay license fees or get sued into oblivion.

[^1]: https://www.hackster.io/news/wave-computing-closes-its-mips-...


Because they aren’t as open in reality as they sounded when announced.

The advantage of those two (and of ARM) is that there are actual implementations with decades of development behind them. Yes, some technical debt, but also many painfully-learned good decisions.

RISC-V, which I'm really excited about, you can think of as a PRD (a product requirements document, from the customer's perspective). That's what an ISA is. Each team builds to meet that using their own implementation, none of which is more than a few years old yet. But the teams incorporate decades of learning, and have largely a blank sheet to start with. I think it will be great... but it isn't yet.



