
If I'm understanding correctly, this is collecting LBR data through hardware support for PGO/AutoFDO, right?


(These are older comments that we merged from https://hackernews.hn/item?id=42888185, in case anyone was confused by the timestamps)


Yes. Although we are studying CSSPGO, which uses a mixed (LBR + software-sampled stacks) approach.


I'm familiar with the paper, but it doesn't improve the situation in terms of LBR availability on cloud providers, does it?


Yes, existing limitations apply. Without hardware LBR support, we cannot provide sPGO profiles. However, the basic profiling should work fine.


Blog is packed with information, thanks!

Isn't it the case that stack traces alone make it nearly impossible to tell that function foo() is burning CPU cycles because it is memory-bound? The real cause could be somewhere else entirely and not in that particular function, e.g. multiple other threads creating contention on the memory bus.

If so, doesn't this make the profile a somewhat questionable input for PGO?


It depends on the event that was sampled to generate the profiles. For example, if you sample instructions by collecting a stack trace every N instructions, you won't actually see foo() burning the CPU. However, if you look at CPU cycles, foo() will be very noticeable. Internally, we use sPGO profiles from sampling CPU cycles, not instructions.
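
To make the difference between the two sampling events concrete, here is a toy sketch. The workload numbers, the function name bar(), and the helper `sample_shares` are all made up for illustration; this is not how any real profiler is implemented. It only shows why a low-IPC function gets a small share of instruction-based samples but a large share of cycle-based samples.

```python
# Hypothetical workload: (function, instructions retired, cycles per instruction).
# All numbers are invented for illustration.
WORKLOAD = [
    ("foo", 1_000_000, 10.0),  # memory-bound: few instructions, many stall cycles
    ("bar", 9_000_000, 0.5),   # compute-bound: many instructions, high IPC
]


def sample_shares(workload, event):
    """Expected share of samples per function when sampling every N `event`s."""
    totals = {}
    for name, instructions, cpi in workload:
        if event == "instructions":
            totals[name] = instructions
        elif event == "cycles":
            totals[name] = instructions * cpi
        else:
            raise ValueError(f"unknown event: {event}")
    grand_total = sum(totals.values())
    return {name: count / grand_total for name, count in totals.items()}


by_instructions = sample_shares(WORKLOAD, "instructions")
by_cycles = sample_shares(WORKLOAD, "cycles")

for name in by_instructions:
    print(f"{name}: {by_instructions[name]:.0%} of instruction samples, "
          f"{by_cycles[name]:.0%} of cycle samples")
```

With these made-up numbers foo() accounts for a small fraction of instruction samples but the majority of cycle samples. Note that neither view says why foo() is slow; it only says where the cycles were spent.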


Right, perhaps I was a little too vague. What I was trying to say is that by merely sampling CPU cycles we cannot infer that foo() was burning CPU because it was memory-bound. Nor can we tell that this is not an artifact of foo()'s implementation, but rather of threads elsewhere in the application saturating the memory bus.

Or is my doubt incorrect?



