
You are so right - everything depends on the dataset. After all that, just skipping 4 characters at a time (4 bytes, one word on a 32-bit arch) actually seems VERY reasonable.

Maybe do it in log steps: 1) check if you can jump 16, 2) check if you can jump 8, 3) check if you can jump 4, then execute.
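For concreteness, here is a rough C sketch of that tiered idea. The function names (despace_tiered, has_space64, has_space32) are made up for illustration, and it assumes separate input and output buffers with the output at least len bytes long. The space test uses the well-known "does this word contain a zero byte" bit trick after XORing with 0x20 in every byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any byte of w equals 0x20 (space). XOR turns spaces into
   zero bytes, then the classic "has zero byte" test is applied. */
static int has_space64(uint64_t w) {
    uint64_t x = w ^ UINT64_C(0x2020202020202020);
    return ((x - UINT64_C(0x0101010101010101)) & ~x &
            UINT64_C(0x8080808080808080)) != 0;
}

static int has_space32(uint32_t w) {
    uint32_t x = w ^ UINT32_C(0x20202020);
    return ((x - UINT32_C(0x01010101)) & ~x & UINT32_C(0x80808080)) != 0;
}

/* Hypothetical tiered despacer: try to copy 16, then 8, then 4 bytes
   when the chunk holds no space; otherwise handle a single byte.
   dst and src must not overlap; dst needs room for len bytes.
   Returns the number of bytes written. */
size_t despace_tiered(char *dst, const char *src, size_t len) {
    size_t i = 0, o = 0;
    while (i < len) {
        size_t left = len - i;
        uint64_t a, b;
        uint32_t c;
        if (left >= 16) {
            memcpy(&a, src + i, 8);
            memcpy(&b, src + i + 8, 8);
            if (!has_space64(a) && !has_space64(b)) {
                memcpy(dst + o, src + i, 16);
                i += 16; o += 16;
                continue;
            }
        }
        if (left >= 8) {
            memcpy(&a, src + i, 8);
            if (!has_space64(a)) {
                memcpy(dst + o, src + i, 8);
                i += 8; o += 8;
                continue;
            }
        }
        if (left >= 4) {
            memcpy(&c, src + i, 4);
            if (!has_space32(c)) {
                memcpy(dst + o, src + i, 4);
                i += 4; o += 4;
                continue;
            }
        }
        /* Fallback: one byte at a time. */
        char ch = src[i++];
        if (ch != ' ') dst[o++] = ch;
    }
    return o;
}
```

Each tier is a data-dependent branch, so whether this beats a plain loop depends entirely on how often spaces show up in the input.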



This is a good instinct, but doesn't work well in practice unless the prevalence of spaces is very low. And if it's very low, you only need the largest check followed by a single fallback. The problem is that mispredicted branches (for example, 'if statements' whose outcome follows no clear pattern, so the processor guesses the wrong code path) are relatively expensive.

The scalar operations of reading a character, checking whether it's a space (0x20), and writing it to an output can often be done in a single cycle (the processor is 'superscalar'). A mispredicted branch costs about 15 cycles. Thus for simple tasks like this, if the average distance between spaces is 16 or less, you are likely better off with the simpler straightforward approach.
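As a minimal sketch of what the "simpler straightforward approach" can look like in C (the name despace_simple is just for illustration, not from any library): every byte is stored unconditionally and the output index only advances for non-spaces, so the only branch left in the loop is the loop condition itself.

```c
#include <stddef.h>

/* Hypothetical scalar despacer. Copies src to dst while dropping
   0x20 bytes. dst needs room for len bytes; dst == src (in place)
   also works because the write index never passes the read index.
   Returns the number of bytes kept. */
size_t despace_simple(char *dst, const char *src, size_t len) {
    size_t o = 0;
    for (size_t i = 0; i < len; i++) {
        char c = src[i];
        dst[o] = c;        /* always store the byte */
        o += (c != ' ');   /* keep it only if it is not a space */
    }
    return o;
}
```

There is nothing here for the branch predictor to get wrong, which is why it holds up well when spaces are frequent.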



