Computers represent letters and other characters as numbers. Historically, the conventions that map characters to numbers were limited to particular writing systems or computing platforms. These encodings often conflicted: the same character could be represented by different numbers in different systems, and the same number could stand for entirely different characters.
Unicode is an attempt to unify these encoding systems. Every character of every writing system currently in use, along with a limited number of historical scripts, is assigned exactly one number (its code point), no matter the operating system or locale.
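As a small illustration of the one-character, one-number idea, the sketch below uses Python's built-in ord() and chr() to look up the number (the code point) assigned to a few characters. The helper name code_point_label is made up for this example; the U+XXXX notation is the usual way of writing these numbers in hexadecimal.

```python
def code_point_label(ch):
    """Return the number assigned to a single character, in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


for ch in ["A", "Ж", "中", "€"]:
    # chr() is the inverse of ord(): it turns the number back into the character.
    assert chr(ord(ch)) == ch
    print(ch, code_point_label(ch))
```

The same numbers come back on any operating system and in any locale, which is the point of the standard.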
A character, in Unicode, is independent of how it appears on screen or in print. Many languages contain characters that have a different shape at the end of a word, for example. In Unicode, both variants are the same character.
How Many Characters Does Unicode Hold?
Unicode has room for 1,114,112 code points, numbered 0 through 1,114,111 (hexadecimal 10FFFF). Several encodings can be used to store these numbers, most commonly UTF-8, UTF-16 and UTF-32. They differ in how each number is translated into bytes, but each supports all Unicode characters. Converting between them is straightforward and loses no information.
Unicode 5.1, the version current at the time of writing, defines 100,713 characters.
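To make the difference between the three encodings concrete, here is a minimal Python sketch that encodes the same short text in each one and compares the byte counts. The helper names encoded_sizes and convert are invented for this example, and the little-endian codec variants ("utf-16-le", "utf-32-le") are chosen only so that no byte-order mark is added to the counts.

```python
# The highest number Unicode can assign; chr() rejects anything larger.
HIGHEST_CODE_POINT = 0x10FFFF

ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_sizes(text):
    """Return how many bytes the text occupies in each encoding."""
    return {name: len(text.encode(name)) for name in ENCODINGS}


def convert(data, source, target):
    """Convert bytes from one encoding to another by way of the characters."""
    return data.decode(source).encode(target)


sample = "A€😀"  # code points U+0041, U+20AC and U+1F600
print(encoded_sizes(sample))

# Every encoding can represent every character, so a round trip loses nothing.
utf8_bytes = sample.encode("utf-8")
utf16_bytes = convert(utf8_bytes, "utf-8", "utf-16-le")
assert convert(utf16_bytes, "utf-16-le", "utf-8") == utf8_bytes
```

UTF-32 always spends four bytes per character, while UTF-8 and UTF-16 use a variable number, which is why the totals differ even though the text is identical.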
Unicode and Email
Email that uses Unicode is typically sent as UTF-8 or UTF-7. The latter is yet another encoding of Unicode characters, designed specifically for email. UTF-7 is not in widespread use, but it has the advantage that it uses only 7-bit ASCII characters and can therefore pass through old mail servers without complications. UTF-8 uses 8-bit bytes, so it often has to be wrapped in Base64 or quoted-printable for transport.
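The following Python sketch shows the idea on a single word, using only standard-library modules (base64, quopri and the built-in "utf-7" codec). It is an illustration under simplifying assumptions, not a mail implementation: real messages also need the matching MIME headers, and the helper name is_seven_bit is made up for this example.

```python
import base64
import quopri


def is_seven_bit(data):
    """True if every byte fits in 7 bits, as old mail servers expect."""
    return all(byte < 128 for byte in data)


text = "café"

utf8_raw = text.encode("utf-8")          # contains bytes above 127
utf7 = text.encode("utf-7")              # 7-bit safe by design
utf8_base64 = base64.b64encode(utf8_raw)
utf8_qp = quopri.encodestring(utf8_raw)  # quoted-printable

for label, data in [("UTF-8", utf8_raw), ("UTF-7", utf7),
                    ("UTF-8 + Base64", utf8_base64),
                    ("UTF-8 + quoted-printable", utf8_qp)]:
    print(f"{label}: {data!r} seven-bit safe: {is_seven_bit(data)}")

# Each wrapped form decodes back to the original text.
assert utf7.decode("utf-7") == text
assert base64.b64decode(utf8_base64).decode("utf-8") == text
assert quopri.decodestring(utf8_qp).decode("utf-8") == text
```

Quoted-printable leaves the ASCII letters readable and escapes only the non-ASCII bytes, whereas Base64 re-encodes the whole text; that is why quoted-printable tends to suit mostly-ASCII text.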