
Pedantic quip: I have a hard time believing you guys were measuring half-nanosecond cache latencies on a machine with a 100MHz clock. :)

And actually the cache numbers seem optimistic, if anything. My memory is that an L1 cache hit on SNB is 5 cycles, which is 2-3x as long as that table shows.



We didn't believe it either until we put a logic analyzer on the bus and found that the numbers were spot on with respect to the number of cycles. I don't remember how far off they were, but it wasn't much. All the hardware dudes were amazed that software could get that close.

tl;dr: the numbers were accurate to the # of cycles, might have been as much as 1/2 of 1 cycle off.

Edit: I should add this was almost 20 years ago, I dunno how well it works today. Sec, lemme go test on a local machine.

OK, I ran on an Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz (I think that's a Sandy Bridge) that is overclocked to 4289 MHz according to mhz, and it looks to me like that machine takes 4 cycles to do an L1 load. That sound right? lmbench says 4.05 cycles.

I poked a little more and I get

L1  4 cycles, ~48K
L2 12 cycles, ~256K
L3 16 cycles, ~6M
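
For scale, those cycle counts can be turned into nanoseconds at the reported 4289 MHz clock. A rough sketch in Python; the helper names are made up for illustration, and the only inputs are the figures quoted in this thread:

```python
def cycles_to_ns(cycles: float, clock_mhz: float) -> float:
    """Convert a latency in clock cycles to nanoseconds."""
    # One cycle lasts 1000 / clock_mhz nanoseconds.
    return cycles * 1000.0 / clock_mhz


def ns_to_cycles(ns: float, clock_mhz: float) -> float:
    """Convert a latency in nanoseconds to clock cycles."""
    return ns * clock_mhz / 1000.0


CLOCK_MHZ = 4289.0  # overclocked speed reported by mhz in the comment

# Cycle counts as posted in the comment.
measured_cycles = {"L1": 4, "L2": 12, "L3": 16}

latency_ns = {
    level: cycles_to_ns(cycles, CLOCK_MHZ)
    for level, cycles in measured_cycles.items()
}

for level, ns in latency_ns.items():
    print(f"{level}: {measured_cycles[level]} cycles = {ns:.2f} ns")
```

The same helpers cover the 100MHz machine from the quip upthread: `cycles_to_ns(1, 100.0)` gives a 10 ns cycle, so half a nanosecond there would be a small fraction of one cycle.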

Off to google and see how far off I am. Whoops, work is calling, will check back later.


You realize he is measuring averages, right? Set up 1 million L1 cache hits, divide by total time...
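
A back-of-the-envelope model of why averaging works, sketched in Python. This is not lmbench's code: the 1 µs timer tick and the 40.5 ns "true" latency are assumptions picked for illustration, and the function name is hypothetical. It only simulates a timer that rounds down to whole ticks; a real benchmark would also have to make each load depend on the previous one so the loads can't overlap.

```python
import math


def measured_per_op_ns(true_ns: float, n_ops: int, tick_ns: float) -> float:
    """Per-op latency recovered from ONE coarse timing of n_ops back-to-back ops.

    The simulated timer only reports whole ticks, so the elapsed total is
    rounded down to a multiple of tick_ns before dividing by n_ops.
    """
    total_ns = true_ns * n_ops
    observed_ns = math.floor(total_ns / tick_ns) * tick_ns
    return observed_ns / n_ops


TRUE_NS = 40.5    # assumed latency of one load (hypothetical value)
TICK_NS = 1000.0  # assumed timer resolution: 1 microsecond
N_OPS = 1_000_000

single = measured_per_op_ns(TRUE_NS, 1, TICK_NS)        # one op: lost below a tick
averaged = measured_per_op_ns(TRUE_NS, N_OPS, TICK_NS)  # a million ops: recovered

# Rounding costs at most one tick over the whole run, so the per-op
# error is bounded by tick / n_ops.
error_bound_ns = TICK_NS / N_OPS

print(f"single op:  {single} ns")
print(f"averaged:   {averaged} ns (error bound {error_bound_ns} ns)")
```

The point of the sketch: the timer's resolution limits the error on the total, not on each operation, so a long enough run resolves latencies far finer than one tick.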



