Deprecated Characters

Once a Unicode character has been officially assigned, it remains part of the standard permanently, and some of its most critical properties, particularly those related to normalisation, can never be changed (cf. Unicode Character Encoding Stability Policies). If a character is found to be seriously defective in a way that cannot be fixed by updated properties, algorithms or documentation, the only possible course of action is to deprecate it.

The use of characters that possess the Deprecated property is strongly discouraged. Sometimes there exist alternative characters or character sequences without problematic properties that can be used in their stead; sometimes a character is so fundamentally broken that the entire concept should be eschewed altogether. Deprecated characters remain valid code points and their exchange is not prohibited. It might even be required to support them for compatibility with certain systems.

Curiously, Unicode employs a two‐tier system for deprecation. Several characters are marked as defective and/or discouraged throughout the standard and its associated charts and data files without actually being formally deprecated in the Unicode Character Database. These characters are also listed on this page for completeness.

Sources: PropList.txt, NamesList.txt
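
The formal Deprecated property is recorded in PropList.txt as lines of the form "code point or range ; property name # comment". The following Python sketch shows one way such lines could be read. The SAMPLE text is an abbreviated, hand-written excerpt in that line format rather than the full file, and the function name code_points_with_property is made up for this illustration.

```python
import re

# Abbreviated sample in the PropList.txt line format (NOT the full file).
SAMPLE = """\
0149          ; Deprecated # L&       LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
17A3..17A4    ; Deprecated # Lo   [2] KHMER INDEPENDENT VOWEL QAQ..KHMER INDEPENDENT VOWEL QAA
206A..206F    ; Deprecated # Cf   [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
"""

# One code point, or a first..last range, followed by "; Property_Name".
LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def code_points_with_property(text, prop):
    """Collect every code point that the given text assigns to `prop`."""
    found = set()
    for line in text.splitlines():
        match = LINE.match(line)
        if match and match.group(3) == prop:
            first = int(match.group(1), 16)
            last = int(match.group(2) or match.group(1), 16)
            found.update(range(first, last + 1))
    return found


deprecated = code_points_with_property(SAMPLE, "Deprecated")
print(sorted(f"U+{cp:04X}" for cp in deprecated))
```

Run against the real PropList.txt, the same function should yield the code points listed under Currently Deprecated below; the informal warnings from NamesList.txt are not machine‐readable in this way.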

Currently Deprecated

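These characters possess the Deprecated property in the most recent version of the Unicode Standard. The Added and Deprecated columns give the Unicode version in which each character was encoded and deprecated, respectively.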

Character Added Deprecated Preferred Representation Notes
0149 ʼn Latin Small Letter N Preceded by Apostrophe 1.1 5.2 2019 006E ’n

Compatibility decomposition mapping uses U+02BC ʼ Modifier Letter Apostrophe as the apostrophe, but U+2019 Right Single Quotation Mark is now the preferred character to represent that punctuation mark. Originally encoded for compatibility with ISO/IEC 6937.

Source: L2/08‐287

0673 ٳ Arabic Letter Alef with Wavy Hamza Below 1.1 6.0 0627 065F اٟ

U+065F ◌ٟ Arabic Wavy Hamza Below was encoded after it was discovered that alef is not the only letter making use of this diacritic.

Source: L2/09‐176

0F77 Tibetan Vowel Sign Vocalic Rr 2.0 5.2 0FB2 0F71 0F80 ྲཱྀ

Compatibility equivalents whose decomposition mappings include U+0F81 ◌ཱྀ Tibetan Vowel Sign Reversed Ii, which itself is weakly deprecated and can never appear in normalised text.

Source: L2/01‐301

0F79 Tibetan Vowel Sign Vocalic Ll 2.0 5.2 0FB3 0F71 0F80 ླཱྀ
17A3 Khmer Independent Vowel Qaq 3.0 4.0 17A2

Duplicate phantom characters originally intended for transliterating Pali and Sanskrit.

Source: L2/02‐097

17A4 Khmer Independent Vowel Qaa 3.0 5.2 17A2 17B6 អា
206A Inhibit Symmetric Swapping 1.1 3.2

Originally intended for a stateful mechanism inherited from ISO/IEC 10646 to toggle bidirectional mirroring. On by default.

Source: L2/01‐301

206B Activate Symmetric Swapping 1.1 3.2
206C Inhibit Arabic Form Shaping 1.1 3.2

Originally intended for a stateful mechanism inherited from ISO/IEC 10646 to toggle whether Arabic presentation forms participate in cursive joining behaviour like regular letters do. Off by default.

Source: L2/01‐301

206D Activate Arabic Form Shaping 1.1 3.2
206E National Digit Shapes 1.1 3.2

Originally intended for a stateful mechanism inherited from ISO/IEC 10646 to toggle whether ASCII digits U+0030 through U+0039 are displayed with alternate, script‐specific glyphs instead of their normal appearance. Off (nominal shapes) by default.

Source: L2/01‐301

206F Nominal Digit Shapes 1.1 3.2
2329 Left-Pointing Angle Bracket 1.1 5.2 27E8

U+2329 and U+232A are canonically equivalent to U+3008 Left Angle Bracket and U+3009 Right Angle Bracket respectively – CJK characters with properties unsuited to non‐ideographic contexts.

Source: L2/01‐317

232A Right-Pointing Angle Bracket 1.1 5.2 27E9
E0001 󠀁 Language Tag 3.1 5.1

A mechanism for tagging language information in plain text was introduced to prevent the disruptive Internet Draft Multi‐Lingual String Format (MLSF) from progressing, and was then quickly abandoned because of severe architectural issues. In Unicode 15.0, said mechanism was completely removed from the core standard.

Sources: RFC 6082, email from Doug Ewell (2003‑10‑21), email from Martin J. Dürst (2019‑01‑29)
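
As a rough illustration of how the table above might be applied, the following Python sketch flags deprecated characters in a string and substitutes the preferred representation where the table lists one. The mapping is copied by hand from this page rather than read from the Unicode Character Database, and the names PREFERRED, NO_ALTERNATIVE, find_deprecated and replace_deprecated are hypothetical.

```python
# Preferred representations copied from the table above (an assumption of
# this sketch: the table is taken at face value, not verified against the UCD).
PREFERRED = {
    0x0149: "\u2019\u006E",
    0x0673: "\u0627\u065F",
    0x0F77: "\u0FB2\u0F71\u0F80",
    0x0F79: "\u0FB3\u0F71\u0F80",
    0x17A3: "\u17A2",
    0x17A4: "\u17A2\u17B6",
    0x2329: "\u27E8",
    0x232A: "\u27E9",
}

# Deprecated characters for which the table lists no alternative.
NO_ALTERNATIVE = set(range(0x206A, 0x2070)) | {0xE0001}


def find_deprecated(text):
    """Return (index, code point) pairs for each deprecated character."""
    return [
        (index, ord(ch))
        for index, ch in enumerate(text)
        if ord(ch) in PREFERRED or ord(ch) in NO_ALTERNATIVE
    ]


def replace_deprecated(text):
    """Substitute preferred representations; other characters are kept."""
    return text.translate(PREFERRED)


sample = "\u2329x\u232A"
print(find_deprecated(sample))
print(replace_deprecated(sample) == "\u27E8x\u27E9")
```

Whether automatic substitution is appropriate depends on the application: the replacement is a different character sequence, and characters without a listed alternative are reported by find_deprecated but left untouched by replace_deprecated.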


Formerly Deprecated

These characters had previously possessed the Deprecated property, but that decision has since been reverted. Use of some of these characters may still be discouraged, and some may be canonically equivalent to their preferred representation.

Character Added Deprecated Undeprecated Preferred Representation Notes
0340 ̀ Combining Grave Tone Mark 1.1 3.2 5.2 0300 ̀

Canonical equivalents. Originally intended for Vietnamese.

Source: Combining Diacritical Marks chart

0341 ́ Combining Acute Tone Mark 1.1 3.2 5.2 0301 ́
17D3 Khmer Sign Bathamasat 3.0 4.0 5.2

A full set of preassembled lunar date symbols now exists in the Khmer Symbols block. U+17D3 therefore no longer serves a purpose.

Source: Khmer chart

E0020 󠀠 Tag Space 3.1 5.1 8.0

The language tagging mechanism that these characters were originally encoded for is no longer part of the standard, but the tags were later repurposed for emoji sequences representing regional flags. Only tags corresponding to small letters and digits, as well as U+E007F, are currently in use.

See Unicode Technical Standard #51: Unicode Emoji for more information.

Sources: The Unicode Standard, section 23.9: Tag Characters, Tags chart

E0021 󠀡 Tag Exclamation Mark 3.1 5.1 8.0
E0022 󠀢 Tag Quotation Mark 3.1 5.1 8.0
E0023 󠀣 Tag Number Sign 3.1 5.1 8.0
E0024 󠀤 Tag Dollar Sign 3.1 5.1 8.0
E0025 󠀥 Tag Percent Sign 3.1 5.1 8.0
E0026 󠀦 Tag Ampersand 3.1 5.1 8.0
E0027 󠀧 Tag Apostrophe 3.1 5.1 8.0
E0028 󠀨 Tag Left Parenthesis 3.1 5.1 8.0
E0029 󠀩 Tag Right Parenthesis 3.1 5.1 8.0
E002A 󠀪 Tag Asterisk 3.1 5.1 8.0
E002B 󠀫 Tag Plus Sign 3.1 5.1 8.0
E002C 󠀬 Tag Comma 3.1 5.1 8.0
E002D 󠀭 Tag Hyphen-Minus 3.1 5.1 8.0
E002E 󠀮 Tag Full Stop 3.1 5.1 8.0
E002F 󠀯 Tag Solidus 3.1 5.1 8.0
E0030 󠀰 Tag Digit Zero 3.1 5.1 8.0
E0031 󠀱 Tag Digit One 3.1 5.1 8.0
E0032 󠀲 Tag Digit Two 3.1 5.1 8.0
E0033 󠀳 Tag Digit Three 3.1 5.1 8.0
E0034 󠀴 Tag Digit Four 3.1 5.1 8.0
E0035 󠀵 Tag Digit Five 3.1 5.1 8.0
E0036 󠀶 Tag Digit Six 3.1 5.1 8.0
E0037 󠀷 Tag Digit Seven 3.1 5.1 8.0
E0038 󠀸 Tag Digit Eight 3.1 5.1 8.0
E0039 󠀹 Tag Digit Nine 3.1 5.1 8.0
E003A 󠀺 Tag Colon 3.1 5.1 8.0
E003B 󠀻 Tag Semicolon 3.1 5.1 8.0
E003C 󠀼 Tag Less-Than Sign 3.1 5.1 8.0
E003D 󠀽 Tag Equals Sign 3.1 5.1 8.0
E003E 󠀾 Tag Greater-Than Sign 3.1 5.1 8.0
E003F 󠀿 Tag Question Mark 3.1 5.1 8.0
E0040 󠁀 Tag Commercial At 3.1 5.1 8.0
E0041 󠁁 Tag Latin Capital Letter A 3.1 5.1 8.0
E0042 󠁂 Tag Latin Capital Letter B 3.1 5.1 8.0
E0043 󠁃 Tag Latin Capital Letter C 3.1 5.1 8.0
E0044 󠁄 Tag Latin Capital Letter D 3.1 5.1 8.0
E0045 󠁅 Tag Latin Capital Letter E 3.1 5.1 8.0
E0046 󠁆 Tag Latin Capital Letter F 3.1 5.1 8.0
E0047 󠁇 Tag Latin Capital Letter G 3.1 5.1 8.0
E0048 󠁈 Tag Latin Capital Letter H 3.1 5.1 8.0
E0049 󠁉 Tag Latin Capital Letter I 3.1 5.1 8.0
E004A 󠁊 Tag Latin Capital Letter J 3.1 5.1 8.0
E004B 󠁋 Tag Latin Capital Letter K 3.1 5.1 8.0
E004C 󠁌 Tag Latin Capital Letter L 3.1 5.1 8.0
E004D 󠁍 Tag Latin Capital Letter M 3.1 5.1 8.0
E004E 󠁎 Tag Latin Capital Letter N 3.1 5.1 8.0
E004F 󠁏 Tag Latin Capital Letter O 3.1 5.1 8.0
E0050 󠁐 Tag Latin Capital Letter P 3.1 5.1 8.0
E0051 󠁑 Tag Latin Capital Letter Q 3.1 5.1 8.0
E0052 󠁒 Tag Latin Capital Letter R 3.1 5.1 8.0
E0053 󠁓 Tag Latin Capital Letter S 3.1 5.1 8.0
E0054 󠁔 Tag Latin Capital Letter T 3.1 5.1 8.0
E0055 󠁕 Tag Latin Capital Letter U 3.1 5.1 8.0
E0056 󠁖 Tag Latin Capital Letter V 3.1 5.1 8.0
E0057 󠁗 Tag Latin Capital Letter W 3.1 5.1 8.0
E0058 󠁘 Tag Latin Capital Letter X 3.1 5.1 8.0
E0059 󠁙 Tag Latin Capital Letter Y 3.1 5.1 8.0
E005A 󠁚 Tag Latin Capital Letter Z 3.1 5.1 8.0
E005B 󠁛 Tag Left Square Bracket 3.1 5.1 8.0
E005C 󠁜 Tag Reverse Solidus 3.1 5.1 8.0
E005D 󠁝 Tag Right Square Bracket 3.1 5.1 8.0
E005E 󠁞 Tag Circumflex Accent 3.1 5.1 8.0
E005F 󠁟 Tag Low Line 3.1 5.1 8.0
E0060 󠁠 Tag Grave Accent 3.1 5.1 8.0
E0061 󠁡 Tag Latin Small Letter A 3.1 5.1 8.0
E0062 󠁢 Tag Latin Small Letter B 3.1 5.1 8.0
E0063 󠁣 Tag Latin Small Letter C 3.1 5.1 8.0
E0064 󠁤 Tag Latin Small Letter D 3.1 5.1 8.0
E0065 󠁥 Tag Latin Small Letter E 3.1 5.1 8.0
E0066 󠁦 Tag Latin Small Letter F 3.1 5.1 8.0
E0067 󠁧 Tag Latin Small Letter G 3.1 5.1 8.0
E0068 󠁨 Tag Latin Small Letter H 3.1 5.1 8.0
E0069 󠁩 Tag Latin Small Letter I 3.1 5.1 8.0
E006A 󠁪 Tag Latin Small Letter J 3.1 5.1 8.0
E006B 󠁫 Tag Latin Small Letter K 3.1 5.1 8.0
E006C 󠁬 Tag Latin Small Letter L 3.1 5.1 8.0
E006D 󠁭 Tag Latin Small Letter M 3.1 5.1 8.0
E006E 󠁮 Tag Latin Small Letter N 3.1 5.1 8.0
E006F 󠁯 Tag Latin Small Letter O 3.1 5.1 8.0
E0070 󠁰 Tag Latin Small Letter P 3.1 5.1 8.0
E0071 󠁱 Tag Latin Small Letter Q 3.1 5.1 8.0
E0072 󠁲 Tag Latin Small Letter R 3.1 5.1 8.0
E0073 󠁳 Tag Latin Small Letter S 3.1 5.1 8.0
E0074 󠁴 Tag Latin Small Letter T 3.1 5.1 8.0
E0075 󠁵 Tag Latin Small Letter U 3.1 5.1 8.0
E0076 󠁶 Tag Latin Small Letter V 3.1 5.1 8.0
E0077 󠁷 Tag Latin Small Letter W 3.1 5.1 8.0
E0078 󠁸 Tag Latin Small Letter X 3.1 5.1 8.0
E0079 󠁹 Tag Latin Small Letter Y 3.1 5.1 8.0
E007A 󠁺 Tag Latin Small Letter Z 3.1 5.1 8.0
E007B 󠁻 Tag Left Curly Bracket 3.1 5.1 8.0
E007C 󠁼 Tag Vertical Line 3.1 5.1 8.0
E007D 󠁽 Tag Right Curly Bracket 3.1 5.1 8.0
E007E 󠁾 Tag Tilde 3.1 5.1 8.0
E007F 󠁿 Cancel Tag 3.1 5.1 9.0
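
The repurposed tag characters mirror ASCII at an offset of U+E0000, so an emoji tag sequence can be assembled mechanically. The following Python sketch builds one from a code such as "gbeng", using U+1F3F4 Waving Black Flag as the base character and U+E007F as the terminator, as described in Unicode Technical Standard #51. The function names are hypothetical, and the sketch does not check whether a given code corresponds to a flag that is actually supported.

```python
TAG_OFFSET = 0xE0000        # tag character = ASCII counterpart + 0xE0000
BLACK_FLAG = "\U0001F3F4"   # U+1F3F4, base character of the sequence
CANCEL_TAG = "\U000E007F"   # U+E007F, terminates the sequence


def subdivision_flag(code):
    """Build an emoji tag sequence from small letters and digits."""
    valid = code and all(
        ch.isascii() and (ch.islower() or ch.isdigit()) for ch in code
    )
    if not valid:
        raise ValueError("only ASCII small letters and digits are allowed")
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in code)
    return BLACK_FLAG + tags + CANCEL_TAG


def tag_payload(sequence):
    """Recover the ASCII text carried by the tag characters of a sequence."""
    return "".join(
        chr(ord(ch) - TAG_OFFSET)
        for ch in sequence
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


england = subdivision_flag("gbeng")
print([f"U+{ord(ch):04X}" for ch in england])
print(tag_payload(england))
```

Rejecting capital letters and punctuation reflects the note above that only tags for small letters and digits, plus U+E007F, are currently in use.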

Unofficially Deprecated

It is recommended that these characters not be used because of various deficiencies as noted in the names list, but they do not possess the Deprecated property and never did. Some may be canonically equivalent to their preferred representation.

Character Added Preferred Representation Notes
0344 ̈́ Combining Greek Dialytika Tonos 1.1 0308 0301 ̈́

Canonical equivalents.

Sources: Combining Diacritical Marks chart, Greek and Coptic chart

037E ; Greek Question Mark 1.1 003B ;
0387 · Greek Ano Teleia 1.1 00B7 ·
0478 Ѹ Cyrillic Capital Letter Uk 1.1 041E 0443 Оу or A64A Ꙋ

Preferred representation depends on whether the “digraph onik” or “monograph uk” form is desired.

Sources: Cyrillic chart, L2/15‐014

0479 ѹ Cyrillic Small Letter Uk 1.1 043E 0443 оу or A64B ꙋ
0675 ٵ Arabic Letter High Hamza Alef 1.1 0674 0627 ٴا

Compatibility decompositions are defective, placing U+0674 ٴ Arabic Letter High Hamza after the respective base letter when it should come before.

Sources: The Unicode Standard, section 9.2.5: Combining Hamza, Arabic chart

0676 ٶ Arabic Letter High Hamza Waw 1.1 0674 0648 ٴو
0677 ٷ Arabic Letter U with Hamza Above 1.1 0674 06C7 ٴۇ
0678 ٸ Arabic Letter High Hamza Yeh 1.1 0674 0649 ٴى
06E1 ۡ Arabic Small High Dotless Head of Khah 1.1 0652 ْ

A font variant.

Source: Arabic chart

0953 Devanagari Grave Accent 1.1 0300 ̀

These accents were never intended for Devanagari but for Latin, where they are redundant.

Source: Devanagari chart

0954 Devanagari Acute Accent 1.1 0301 ́
0AF1 Gujarati Rupee Sign 4.0 0AB0 0AC2 0AF0 રૂ૰

U+0AF0 Gujarati Abbreviation Sign hadn’t been encoded yet when U+0AF1 was introduced.

Sources: Gujarati chart, L2/09‐330

0F73 Tibetan Vowel Sign Ii 2.0 0F71 0F72 ཱི

Canonical equivalents.

Source: Tibetan chart

0F75 Tibetan Vowel Sign Uu 2.0 0F71 0F74 ཱུ
0F81 Tibetan Vowel Sign Reversed Ii 2.0 0F71 0F80 ཱྀ
17A8 Khmer Independent Vowel Quk 3.0 17A7 1780 ឧក

An obsolete ligature.

Source: Khmer chart

17D8 Khmer Sign Beyyal 3.0 17D4 179B 17D4 ។ល។

Other abbreviations with the same meaning of “et cetera” exist.

Source: Khmer chart

20A4 Lira Sign 1.1 00A3 £

Intended for the lira, but not widely used; U+00A3 Pound Sign is the preferred character.

Source: Currency Symbols chart

2126 Ohm Sign 1.1 03A9 Ω

Canonical equivalents.

Source: Letterlike Symbols chart

212B Angstrom Sign 1.1 00C5 Å
2DF5 Combining Cyrillic Letter Es-Te 5.1 2DED 2DEE ⷭⷮ

Combining Cyrillic letters when used in sequence are generally intended to stack horizontally rather than vertically, making U+2DF5 redundant.

Sources: Cyrillic Extended‑A chart, L2/15‐002, L2/15‐014

301E Double Prime Quotation Mark 1.1 301F

Originally intended as a “double prime” version of U+201D Right Double Quotation Mark, but U+301F Low Double Prime Quotation Mark already fulfills that function.

Source: CJK Symbols and Punctuation chart

111C4 𑇄 Sharada Om 6.1 1118F 11180 𑆏𑆀

Concerns about U+111C4 were raised too late into the development process of Unicode 6.1.0 to remove it from the repertoire. Usage is discouraged until evidence for a Sharada om with an appearance distinct from the sequence <U+1118F, U+11180> is found.

Sources: Sharada chart, L2/11‐308, L2/12‐019

130FA 𓃺 Egyptian Hieroglyph E034a 5.2 130F9 𓃹

Stylistic variants.

Source: Egyptian Hieroglyphs chart

1310C 𓄌 Egyptian Hieroglyph F013a 5.2 1310B 𓄋
13169 𓅩 Egyptian Hieroglyph G036a 5.2 13168 𓅨
1316B 𓅫 Egyptian Hieroglyph G037a 5.2 1316A 𓅪
1320A 𓈊 Egyptian Hieroglyph N025a 5.2 13209 𓈉
13215 𓈕 Egyptian Hieroglyph N034a 5.2 13214 𓈔
133A0 𓎠 Egyptian Hieroglyph V030a 5.2 1339F 𓎟
133B2 𓎲 Egyptian Hieroglyph W003a 5.2 133B1 𓎱
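
Several rows on this page are marked as canonical equivalents, which means that normalisation replaces the discouraged character with its preferred representation automatically. The following Python sketch checks this with the standard unicodedata module for a selection of code points from the tables above. It also shows that characters with only a compatibility decomposition survive NFC and change only under NFKD. The dictionaries and the hexes helper are written for this illustration, and results depend on the Unicode version bundled with the Python interpreter, although all of these characters are old enough that this should not matter in practice.

```python
import unicodedata


def hexes(text):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


# Canonical equivalents: NFC alone yields the right-hand side.
CANONICAL = {
    "\u0340": "\u0300",
    "\u0344": "\u0308\u0301",
    "\u037E": "\u003B",
    "\u0387": "\u00B7",
    "\u0F73": "\u0F71\u0F72",
    "\u0F81": "\u0F71\u0F80",
    "\u2126": "\u03A9",
    "\u212B": "\u00C5",
    "\u2329": "\u3008",
}

# Compatibility decompositions: unchanged by NFC, expanded by NFKD.
COMPATIBILITY = {
    "\u0149": "\u02BC\u006E",
    "\u0675": "\u0627\u0674",   # base letter first, high hamza after it
    "\u0F77": "\u0FB2\u0F71\u0F80",
}

for source in CANONICAL:
    print(hexes(source), "NFC ->", hexes(unicodedata.normalize("NFC", source)))

for source in COMPATIBILITY:
    print(hexes(source), "NFKD ->", hexes(unicodedata.normalize("NFKD", source)))
```

The NFKD result for U+0675 illustrates the defect described in the notes: the decomposition places U+0674 after the base letter rather than before it.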