> - and 2 bytes per character stored

No, 2 bytes per code unit. UTF-16 encodes a single code point as one or two code units, so a single "character" (code point in Unicode terminology) is either 2 or 4 bytes in Java. Additionally, a single user-perceived character (a grapheme cluster) can require multiple code points.
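A quick way to see the difference is to count all three things for a few strings. This is only a sketch: the class and method names (`CodeUnits`, `codeUnits`, `codePoints`, `utf16Bytes`) are made up for illustration, and the byte count is the size of the UTF-16 encoding of the string, not a measurement of actual heap usage.

```java
import java.nio.charset.StandardCharsets;

public class CodeUnits {
    // Number of UTF-16 code units; this is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points (a surrogate pair counts once).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Size of the string encoded as UTF-16 big-endian, without a BOM.
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    static void report(String label, String s) {
        System.out.println(label + ": codeUnits=" + codeUnits(s)
                + " codePoints=" + codePoints(s)
                + " utf16Bytes=" + utf16Bytes(s));
    }

    public static void main(String[] args) {
        report("BMP letter", "A");                   // U+0041
        report("astral emoji", "\uD83D\uDE00");      // U+1F600 as a surrogate pair
        report("e + combining acute", "e\u0301");    // one visible character, two code points
    }
}
```

The emoji is one code point but two code units (4 bytes), while "e" plus a combining accent displays as one character but is two code points.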


