Unicode was created to make every character used in the world available in a single common character set. It is used in Unix, Windows, macOS, Plan 9, Java, and other systems, and covers not only modern scripts but also ancient and historical characters, mathematical symbols, and emoji.

Interoperability with pre-Unicode character codes is also taken into account to some extent. Where identity with legacy characters must be preserved for historical or practical reasons, compatibility areas are reserved, and some characters are designed so that text converted from the original code to Unicode and back returns to its original form (round-trip conversion). Within the range of the official JIS X 0208 there are few problems. However, garbled characters can occur when multiple character sets are mixed, or when mapping tables differ, as with CP932 (the de facto form of Shift_JIS) and with CP51932 and eucJP-MS (variants of EUC-JP).
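
As a minimal sketch of the mapping differences described above, the Python snippet below decodes the same two bytes with the standard library's `shift_jis` and `cp932` codecs and then checks whether they survive a round trip. The byte pair 0x81 0x60 and the helper name `round_trips` are chosen here for illustration and do not come from the text. The behavior shown is that of CPython's bundled codecs; other converters may use different tables.

```python
# Hypothetical illustration: one legacy byte sequence, two mapping tables.
RAW = b"\x81\x60"  # the wave-dash position in Shift_JIS (example chosen here)

as_sjis = RAW.decode("shift_jis")  # table based on JIS X 0208
as_cp932 = RAW.decode("cp932")     # Microsoft's table

print(f"shift_jis -> U+{ord(as_sjis):04X}")
print(f"cp932     -> U+{ord(as_cp932):04X}")


def round_trips(raw: bytes, decode_with: str, encode_with: str) -> bool:
    """Return True if raw survives legacy -> Unicode -> legacy unchanged."""
    try:
        return raw.decode(decode_with).encode(encode_with) == raw
    except UnicodeError:
        # The character has no mapping in one of the tables.
        return False


print(round_trips(RAW, "shift_jis", "shift_jis"))
print(round_trips(RAW, "shift_jis", "cp932"))
```

When the decoding and encoding tables match, the bytes survive the round trip. When they differ, the conversion can fail or produce different bytes, which is one way garbled text arises.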

Encoding examples for each character encoding format

Character   Code point   UTF-8 (bytes)   UTF-16 (16-bit units)   UTF-32 (32-bit units)
A           U+0041       41              0041                    00000041
Ω           U+03A9       CE A9           03A9                    000003A9
語          U+8A9E       E8 AA 9E        8A9E                    00008A9E
😊          U+1F60A      F0 9F 98 8A     D83D DE0A               0001F60A
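
The values in the table above can be reproduced with Python's built-in codecs. This is a small sketch, not part of the original text: the helper name `code_units` is hypothetical, and the big-endian codec variants (`utf-16-be`, `utf-32-be`) are used on the assumption that the table lists code units without a byte order mark. The sample string consists of the code points U+0041, U+03A9, U+8A9E, and U+1F60A.

```python
def code_units(text: str, encoding: str, unit_size: int) -> list:
    """Encode text and split the result into hex strings of unit_size bytes."""
    data = text.encode(encoding)
    return [
        data[i:i + unit_size].hex().upper()
        for i in range(0, len(data), unit_size)
    ]


SAMPLE = "\u0041\u03A9\u8A9E\U0001F60A"

for label, encoding, size in [
    ("UTF-8", "utf-8", 1),       # 8-bit code units
    ("UTF-16", "utf-16-be", 2),  # 16-bit code units, big-endian
    ("UTF-32", "utf-32-be", 4),  # 32-bit code units, big-endian
]:
    print(label, " ".join(code_units(SAMPLE, encoding, size)))
```

The last character lies outside the Basic Multilingual Plane, so it takes four bytes in UTF-8 and a surrogate pair (two 16-bit units) in UTF-16, while UTF-32 always uses one 32-bit unit per code point.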