
The granularity doesn't necessarily need to get down to the stroke level. In fact, if you start playing the video at around the 6-minute mark, you'll see that many of the characters are composed of the same elements, known as radicals (see: http://en.wikipedia.org/wiki/Radical_(Chinese_character) ).

Unicode does have the concept of "combining characters", in which a sequence of characters composes a single glyph (see: http://en.wikipedia.org/wiki/Combining_character ), but they are currently used mainly for adding diacritics. All Chinese characters in the current Unicode standard are precomposed, though it's not out of the question to encode them as composites of two or more sub-characters. The downside is that each character would take several more bytes to encode; the advantage is that novel characters could be created by combining existing ones, which currently can't be done without explicitly adding the new character to the Unicode standard (see: http://en.wikipedia.org/wiki/Precomposed_character#Chinese_c... ).
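To make that concrete, here's a small Python sketch of how combining characters already work for Latin diacritics: the precomposed and decomposed forms are different code point sequences that normalize to the same thing. Han characters have no such canonical decompositions in the current standard, which is exactly the gap described above.

    import unicodedata

    # Precomposed: U+00E9 LATIN SMALL LETTER E WITH ACUTE, a single code point.
    precomposed = "\u00e9"

    # Decomposed: "e" followed by U+0301 COMBINING ACUTE ACCENT, two code points
    # that render as one glyph.
    combined = "e\u0301"

    print(len(precomposed), len(combined))                         # 1 2
    print(precomposed == combined)                                 # False at the code point level
    print(unicodedata.normalize("NFC", combined) == precomposed)   # True after normalization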

The input methods that fedd mentions are just that: input methods. They translate what the user types into encodings of precomposed characters, which is different from having the encodings themselves represent the composition of the characters.
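For illustration only, here's a toy sketch of what a pinyin-style input method does conceptually: map a typed syllable to candidate precomposed characters, each of which is stored as a single code point. The dictionary below is a made-up sample, not a real IME table.

    # Hypothetical mini "IME": typed syllable -> candidate precomposed characters.
    CANDIDATES = {
        "ma": ["\u5988", "\u9a6c", "\u7801"],   # 妈, 马, 码
        "hao": ["\u597d", "\u53f7"],            # 好, 号
    }

    def lookup(syllable):
        return CANDIDATES.get(syllable, [])

    print(lookup("ma"))  # the user picks one; only that single code point is stored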



This comment explains why this hasn't been done: https://hackernews.hn/item?id=2435708



