It can speculate whether an optimization is performant. Not whether it is sound....

pron · 2026-02-01T16:50:24 1769964624

> Not whether it is sound.

Precisely right, but the entire point is that it doesn't need to. The optimisation is applied in such a way that when it is wrong, a signal triggers, at which point the method is "deoptimised".

That is why Java can and does aggressively optimise things that are hard for compilers to prove. If it turns out to be wrong, the method is then deoptimised.

galangalalgol · 2026-02-01T17:21:24 1769966484

But how can it know the optimization violated aliasing or rounding order or any number of usually silent UB?

pron · 2026-02-01T23:58:38 1769990318

There's no aliasing in the messy C sense in Java (and no pointers into the middle of objects at all). As for other optimisations, there are traps inserted to detect violation if speculation is used at all, but the main thrust of optimisation is quite simple:

The main optimisation is inlining, which, by default, is done to the depth of 15 (non-trivial) calls, even when they are virtual, i.e. dispatched dynamically, and that's the main speculation - that a specific callsite calls a specific target. Then you get a large inlined context within which you can perform optimisations that aren't speculative (but proven).
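To make the callsite speculation concrete, here is a minimal sketch. All class and method names are made up for illustration, and whether the JIT actually inlines or deoptimises here depends on the JVM version and flags. The `f.apply(x)` call is dispatched through an interface, but during warm-up it only ever sees one receiver type, so the compiler is free to bet on that target and inline it. Passing a second type later is the kind of event that can invalidate such a bet.

```java
import java.util.Arrays;

// Hypothetical demo; none of these names come from the JDK.
class InlineSpeculationDemo {
    interface Filter { double apply(double x); }

    static final class Gain implements Filter {
        private final double k;
        Gain(double k) { this.k = k; }
        public double apply(double x) { return k * x; }
    }

    static final class Offset implements Filter {
        private final double d;
        Offset(double d) { this.d = d; }
        public double apply(double x) { return x + d; }
    }

    // f.apply(x) is a dynamically dispatched (interface) callsite.
    static double run(Filter f, double[] in) {
        double sum = 0.0;
        for (double x : in) sum += f.apply(x);
        return sum;
    }

    public static void main(String[] args) {
        double[] data = new double[1024];
        Arrays.fill(data, 1.0);

        // Warm-up: the callsite in run() only ever sees Gain.
        Filter gain = new Gain(2.0);
        double acc = 0.0;
        for (int i = 0; i < 20_000; i++) acc += run(gain, data);

        // A new receiver type arrives. If the JIT had speculated on Gain,
        // this is where it would have to fall back; the result stays correct.
        double other = run(new Offset(0.5), data);

        System.out.println(acc + " " + other);
    }
}
```

On HotSpot, running with `-XX:+PrintCompilation` (and, for more detail, `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining`) should show what the compiler decided for `run`. The exact output varies between JDK versions.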

If you've seen Andrew Kelley's talk about "the vtable boundary"[1] and how it makes efficient abstraction difficult, that boundary does not exist in Java because compilation is at runtime and so the compiler can see through vtables.

But it's also important to remember that low-level languages and Java aim for different things when they say "performance". Low-level languages aim for the worst case, i.e. some things may be slower than others (e.g. dynamic vs. static dispatch), but when you can use the faster construct, you are guaranteed a certain optimisation. Java aims to optimise something more like "average case" performance, i.e. when you write a program with the most natural and general constructs, it will be the fastest for that level of effort. You're not guaranteed certain optimisations, but you're not penalised for more natural, easier-to-evolve code either.

The worst-case model can get you good performance when you first write the program. But over time, as the program evolves and features are added, things usually get more general, and low-level languages do have an "abstraction penalty", so performance degrades, which is costly, until at some point you may need to rearchitect everything, which is also costly.

[1]: https://youtu.be/f30PceqQWko

galangalalgol · 2026-02-03T11:53:27 1770119607

I mostly do DSP and control software, so number heavy. I am excited at the prospect of anything that might get me a performance boost. I tried porting a few smaller tests to Java and got C2 to compile some of it, but I couldn't get it to autovectorize anything without making massive (and unintuitive) changes to the data structures. So it was still roughly 3x slower than the original in Rust. I'll be trying it again though when Valhalla hits, so thanks for the heads up.

pron · 2026-02-04T12:05:06 1770206706

You can use the Vector API (https://openjdk.org/jeps/529) for manual vectorisation.

That said, there's no doubt that the lack of flattened objects is the last remaining real performance issue for Java vs. a lower-level language, and it specifically impacts programs of the kind you're writing. Valhalla will take care of that.
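
For anyone unsure what "flattened" means in practice, here is a rough sketch with made-up names (it is not Valhalla code). Today an array of objects is an array of references to separately allocated objects, while primitive arrays are laid out contiguously. The second method below shows the kind of data-structure change that is currently needed to get flat data, which I assume is similar to what you ended up doing. I make no claim about whether either loop gets vectorised on a given JVM.

```java
// Hypothetical demo; Complex and the method names are illustrative only.
class LayoutDemo {
    // Array-of-objects: Complex[] holds references, and each Complex
    // is its own heap object.
    static final class Complex {
        final double re, im;
        Complex(double re, double im) { this.re = re; this.im = im; }
    }

    static double energyAoS(Complex[] xs) {
        double e = 0.0;
        for (Complex c : xs) e += c.re * c.re + c.im * c.im;
        return e;
    }

    // Struct-of-arrays: two flat double[] arrays, no per-element objects.
    static double energySoA(double[] re, double[] im) {
        double e = 0.0;
        for (int i = 0; i < re.length; i++) e += re[i] * re[i] + im[i] * im[i];
        return e;
    }

    public static void main(String[] args) {
        int n = 4;
        Complex[] aos = new Complex[n];
        double[] re = new double[n];
        double[] im = new double[n];
        for (int i = 0; i < n; i++) {
            re[i] = i;
            im[i] = 1.0;
            aos[i] = new Complex(re[i], im[i]);
        }
        // Both layouts compute the same value; only the memory layout differs.
        System.out.println(energyAoS(aos) + " " + energySoA(re, im));
    }
}
```

The point of flattening is that you could keep writing the first form and get a layout closer to the second.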