You don't even have to go to non-English text: Unicode by itself has things like mathematical spaces and spaces of various em widths. http://perldoc.perl.org/perlrecharclass.html has a reasonably good list of non-ASCII spaces, though I'm not sure where to dig for locale-specific lists.
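For a sense of how many of these there are, here is a rough sketch of a check against the code points that carry the Unicode White_Space property. The function name `is_unicode_space` is made up for illustration, and it assumes the input has already been decoded from UTF-8 into code points; it says nothing about locale-specific conventions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the decoded code point `cp` has the
   Unicode White_Space property. Takes code points, not raw UTF-8 bytes. */
bool is_unicode_space(uint32_t cp) {
    /* ASCII range: tab, LF, VT, FF, CR, and the plain space. */
    if (cp < 0x80) {
        return cp == 0x20 || (cp >= 0x09 && cp <= 0x0D);
    }
    switch (cp) {
    case 0x0085: /* next line */
    case 0x00A0: /* no-break space */
    case 0x1680: /* ogham space mark */
    case 0x2028: /* line separator */
    case 0x2029: /* paragraph separator */
    case 0x202F: /* narrow no-break space */
    case 0x205F: /* medium mathematical space */
    case 0x3000: /* ideographic space */
        return true;
    default:
        /* U+2000..U+200A: en quad through hair space (the em-width family). */
        return cp >= 0x2000 && cp <= 0x200A;
    }
}
```

Note that U+200B (zero width space) is not in the list despite its name, which is the kind of detail that makes "just strip the spaces" harder than it looks.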
That said, while it may not be a solution for every case, it's a solution for the common case and a starting point for other cases, and thus pretty nifty and potentially useful.
This is a good instinct, but it doesn't work well in practice unless spaces are very rare. And if they are that rare, you only need the largest check followed by a single fallback. The problem is that mispredicted branches (for example, 'if statements' that take different code paths without a clear pattern) are relatively expensive.
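To make "one large check followed by a single fallback" concrete, here is a minimal sketch that tests 8 bytes at a time for any 0x20 byte and only drops to a per-character loop when one is found. The names `has_space8` and `despace_blocks` are invented for this example, it handles only the ASCII space, and `dst` is assumed to have room for `n` bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the 8 bytes packed in v equals 0x20.
   Uses the classic "does this word contain a zero byte" bit trick. */
int has_space8(uint64_t v) {
    uint64_t x = v ^ 0x2020202020202020ULL; /* bytes equal to 0x20 become 0 */
    return ((x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL) != 0;
}

/* Copies src[0..n) to dst without ASCII spaces; returns bytes written. */
size_t despace_blocks(char *dst, const char *src, size_t n) {
    size_t i = 0, out = 0;
    while (i + 8 <= n) {
        uint64_t v;
        memcpy(&v, src + i, 8);            /* safe unaligned load */
        if (!has_space8(v)) {
            memcpy(dst + out, &v, 8);      /* fast path: copy whole block */
            out += 8;
        } else {
            for (size_t j = 0; j < 8; j++) { /* fallback: one char at a time */
                char c = src[i + j];
                if (c != ' ') dst[out++] = c;
            }
        }
        i += 8;
    }
    for (; i < n; i++) {                   /* leftover tail of fewer than 8 bytes */
        if (src[i] != ' ') dst[out++] = src[i];
    }
    return out;
}
```

The `if (!has_space8(v))` test is exactly the branch in question: it predicts well when nearly every block is space-free, and badly when blocks with and without spaces are mixed unpredictably.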
The scalar operations of reading a character, checking whether it's a space (0x20), and writing it to an output can often be done in a single cycle (the processor is 'superscalar'). A mispredicted branch costs about 15 cycles. Thus for simple tasks like this, if the average distance between spaces is 16 or less, you are likely better off with the simpler straightforward approach.
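As an illustration of the "simpler straightforward approach", here is one way to write the per-character loop so that nothing depends on predicting where the spaces are. This is only a sketch: `despace_branchless` is a made-up name, `dst` is assumed to have room for `n` bytes, and whether the compiler actually emits branch-free code for it depends on the compiler and flags.

```c
#include <stddef.h>

/* Copies src[0..n) to dst without ASCII spaces; returns bytes written.
   Every character is written unconditionally; the output index only
   advances when the character was not a space, so a space is simply
   overwritten by whatever comes next. */
size_t despace_branchless(char *dst, const char *src, size_t n) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        char c = src[i];
        dst[out] = c;        /* always store */
        out += (c != ' ');   /* comparison yields 0 or 1, no if needed */
    }
    return out;
}
```

The only branch left is the loop condition, which is taken on every iteration but the last and so is predicted almost perfectly.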
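The average (English) word is ~5 characters long, so in ordinary text a space turns up roughly every 6 characters, well under that 16-character break-even. Most of the time, you'd be forced to check anyway.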