Short-circuiting comparisons are still linear for integers of unbounded size. But besides that, it's incredibly common to assume that operations on 64-bit primitives take constant time, because to the asymptotic analysis, what matters is that the integers are _bounded_ in size. As long as they're bounded, the asymptotic analysis here doesn't care -- you can pick any size you want, as long as it's bounded.
One thing to keep in mind here (which it seems you are maybe confused about) is that these bounds are _asymptotics_ of the input size generally, and have very little to say about any particular choice of input size. 64-bit, 32-bit, whatever, it doesn't matter: if the size of the operands is bounded, they're all the same thing to the asymptotic analysis. As long as they're bounded, you can say that the cost of operations on those operands is proportional to a constant times the (bounded) size of the operands.
The consequence of this is that when you deal with operands of _unbounded_ size, all of this asymptotic analysis goes right out the window. You can't possibly have a meaningful comparison operator, for example, that operates in constant time on unbounded integers.
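To make the short-circuiting point concrete, here's a minimal sketch (my own illustration, not anything from a real bignum library) of comparing two arbitrary-precision integers stored as equal-length lists of 64-bit limbs, most significant limb first. The loop bails out at the first differing limb, but in the worst case it still has to walk every limb, so the comparison is linear in the number of limbs, not constant.

```python
def compare_limbs(a, b):
    """Return -1, 0, or 1 for a < b, a == b, a > b (same-length limb lists,
    most significant limb first)."""
    assert len(a) == len(b)
    for x, y in zip(a, b):
        if x != y:                        # short-circuit on the first difference
            return -1 if x < y else 1
    return 0                              # had to inspect every limb

# Worst case: the operands differ only in the least significant limb,
# so all n limbs get inspected -- linear, not constant, time.
n = 1_000_000
a = [0xFFFFFFFFFFFFFFFF] * n
b = a[:-1] + [0xFFFFFFFFFFFFFFFE]
print(compare_limbs(a, b))  # 1, after reading all n limbs
```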
The discussion is about big O, yes? Then this isn't really relevant to it. Parallel analysis and average-case arguments, for the purposes of the discussion here, are fancy games you can try to play to get nicer bounds, but they do not really have any impact on the big O of these algorithms.
"You can't possibly have a meaningful comparison operator, for example, that operates in constant time on unbounded integers."
We don't really deal with unbounded integers; the worst case is something like 2^(2^1,000,000,000,000,000,000) before we can't actually store them.
Even then, log of log of N makes random numbers of that size practically constant time to compare. Put another way, there's less than a 1 in 2^64 chance you need more than 3 comparisons, and less than a 1 in 2^128 chance you need more than 4: one for the length, one for the first chunk (which might be just one bit), and one more for the next 64 bits. Sure, that might happen, but realistically random input is unlikely to need many comparisons.
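A quick sketch of that probabilistic claim (my own illustration, with made-up limb counts and trial counts): for uniformly random operands, the most significant 64-bit limbs already differ with probability 1 - 2^-64, so a short-circuiting comparison almost never looks past the first limb.

```python
import random

def limbs_inspected(a, b):
    """Number of limbs a short-circuiting comparison actually reads."""
    for i, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return i
    return len(a)

# Random 6400-bit operands (100 limbs of 64 bits each), 10,000 trials.
n_limbs, trials = 100, 10_000
counts = [limbs_inspected([random.getrandbits(64) for _ in range(n_limbs)],
                          [random.getrandbits(64) for _ in range(n_limbs)])
          for _ in range(trials)]
print(max(counts))  # almost certainly 1: two random limbs collide with prob 2^-64
```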
Do you realize that the whole point of asymptotic analysis is that _we do not care_ what integers we "really" deal with? These bounds specifically deal with sorting arbitrary data of unbounded size, and _no other assumptions_.
Do you understand? No assumption that we can use parallelism. No games with integers you see in "real life". None of that is relevant. You can't play games with practicalities to get a better bound. This is the general bound, and if you fiddle with it to get something else, you are changing the scope of the question to be something else entirely.
If you want to have a discussion about those other models, fine, but that's not the discussion we're having.
What you're talking about is a type of overly simplified model that's irrelevant for actual computing. My point is that there are other models out there. If you actually want the fastest algorithm out there, you need to design it for actual hardware, with finite data, caches, limited registers, limited RAM, and sometimes an approximation of a real instruction set, etc.
However, by changing your assumptions you can still use O notation with more complex models.