Unicode

Character encoding standard

2025
DIN 91379
OpenPDF project noted as currently not active, with OpenPDFSaucer maintaining a fork of the original project.
2024
Universal Coded Character Set
UTF-32 encoding permits a binary representation of all existing code points in APIs and software applications as of this year.
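As a rough illustration of the fixed-width property behind this entry, the sketch below encodes a few sample characters with Python's built-in `utf-32-be` codec. The sample characters are arbitrary choices for illustration, not taken from the source.

```python
# Encode a few sample characters as UTF-32 (big-endian, no byte order mark).
# The samples are illustrative: U+0041, U+00E9, U+6C49 and U+1F60D.
samples = ["A", "\u00e9", "\u6c49", "\U0001F60D"]

for ch in samples:
    encoded = ch.encode("utf-32-be")
    # Each code point occupies exactly four bytes in UTF-32.
    print(f"U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} bytes)")

# The highest Unicode code point, U+10FFFF, also fits in four bytes.
highest = chr(0x10FFFF).encode("utf-32-be")
print(highest.hex())
```

Because every code point takes the same number of bytes, the n-th code point of a UTF-32 buffer sits at byte offset 4n, which is the main practical appeal of the encoding in APIs.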
November 1 2024
DIN 91379
Compliance with DIN 91379 becomes mandatory for German authorities and organizations when exchanging data with citizens and businesses.
September 2024
Box-drawing characters
Unicode version 16.0 was released, extending the Unicode standard with a new block called Symbols for Legacy Computing Supplement, which includes additional box-drawing characters and symbols from obsolete operating systems primarily from the 1970s and 1980s.
September 2024
Universal Character Set characters
Unicode version 16.0 was released, with 299,056 (27%) code points allocated, 155,063 (14%) characters assigned, and 137,468 (12%) reserved for private use.
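The percentages in this entry can be checked with a short calculation. The sketch assumes the denominator is the full Unicode code space of 17 planes with 65,536 code points each (1,114,112 in total), a figure that the entry itself does not state.

```python
# Assumption: percentages are taken over the whole code space,
# 17 planes of 65,536 code points each (1,114,112 in total).
TOTAL_CODE_POINTS = 17 * 65_536

counts = {
    "allocated": 299_056,
    "assigned characters": 155_063,
    "private use": 137_468,
}

shares = {}
for label, count in counts.items():
    shares[label] = 100 * count / TOTAL_CODE_POINTS
    print(f"{label}: {count:,} of {TOTAL_CODE_POINTS:,} = {shares[label]:.1f}%")
```

Rounded to whole percentages, the three shares come out at 27%, 14%, and 12%, matching the figures quoted in the entry.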
2023
CJK Unified Ideographs
China's Ministry of Public Security added 622 characters to the CJK Unified Ideographs Extension I block in the Unicode Supplementary Ideographic Plane, spanning the range U+2EBF0 through U+2EE5F.
2022
CJK Unified Ideographs
Unicode 15.0 added 1 character to CJK Unified Ideographs Extension C and introduced Extension H in the Tertiary Ideographic Plane (TIP) with 4,192 characters, bringing the total to 97,058 characters.
August 2022
DIN 91379
The final DIN standard for DIN 91379 was formally established, completing the initial standardization process.
July 2022
DIN 91379
The German federal IT architecture guideline requires use of the predecessor standard, DIN SPEC 91379.
June 2022
DIN 91379
Revision of DIN 2137-1 keyboard layouts E1 and E2 was completed to enable entry of all characters in DIN 91379 (except Cyrillic letters) without using Unicode values or decimal codes.
May 2022
DIN 91379
DIN 5009:2022-06 standard was published, providing German-language names, spelling rules, and announcement words for characters in DIN 91379.
January 2022
Hearts in Unicode
Middle Eastern news publications reported that sending a Red Heart emoji on WhatsApp in Saudi Arabia could be considered harassment, potentially leading to a maximum two-year jail sentence.
2021
CJK Unified Ideographs
Unicode 14.0 added 3 CJK Unified Ideographs to the BMP, 2 to Extension B, and 4 to Extension C.
December 14 2021
Bidirectional text
Visual Studio version 17.0.3 was released, implementing highlighting of Unicode bidirectional control characters to mitigate the Trojan Source vulnerability.
October 2021
Bidirectional text
Visual Studio Code version 1.62 was released, introducing highlighting of Unicode bidirectional (BiDi) control characters to address potential security vulnerabilities.
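The kind of check behind such highlighting can be sketched in a few lines. This is a minimal illustration and not how either editor implements it. It assumes the relevant characters are the embedding and override controls U+202A to U+202E and the isolate controls U+2066 to U+2069. The function name `find_bidi_controls` and the sample string are hypothetical.

```python
# Assumed set of bidirectional formatting characters to flag:
# embeddings/overrides U+202A..U+202E and isolates U+2066..U+2069.
BIDI_CONTROLS = {chr(cp) for cp in range(0x202A, 0x202F)} | {
    chr(cp) for cp in range(0x2066, 0x206A)
}


def find_bidi_controls(text):
    """Return (line, column, code point label) for each bidi control found."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits


# Hypothetical source line hiding a right-to-left override (U+202E).
sample = 'access = "user\u202e" # check admin'
print(find_bidi_controls(sample))
```

A linter or pre-commit hook could run a scan like this over source files and reject any file with a non-empty result, since these characters are invisible in most renderings.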
2020
IDN homograph attack
Microsoft Edge adopts the Chromium-based browser approach to handling IDN homograph attacks.
2020
Universal Coded Character Set
ISO/IEC 10646:2020 was published, providing the base for Unicode 13.0 and subsequent versions, demonstrating the continuous evolution of the Universal Coded Character Set.
2020
CJK Unified Ideographs
Unicode 13.0 significantly expanded CJK character sets by adding 13 CJK Unified Ideographs to the BMP, 10 to CJK Unified Ideographs Extension A, 7 to Extension B, and introduced Extension G in the Tertiary Ideographic Plane (TIP) with 4,939 characters.
March 2020
Plane
Unicode 13.0 was released, adding CJK Unified Ideographs Extension G to the Tertiary Ideographic Plane (TIP).
January 20 2020
XeTeX
The last known change to the XeTeX source code was made, with no further development occurring after this date.
2019
IDN homograph attack
Chiba et al. developed DomainScouter, a system capable of detecting diverse IDN homographs by analyzing approximately 4.4 million registered IDNs across 570 Top-Level Domains, successfully identifying 8,284 previously undetected IDN homographs targeting brands in multiple languages.
2019
IDN homograph attack
Suzuki et al. introduced ShamFinder, a research program designed to recognize Internationalized Domain Name (IDN) homographs, providing insights into their real-world prevalence.
2019
Hearts in Unicode
The Unicode Consortium ranked the Heart Eyes emoji (😍) as the third most used emoji, behind the Red Heart and Face with Tears of Joy emoji.
March 2019
DIN 91379
DIN SPEC 91379 was specified, marking a significant milestone in the standardization process for character encoding and representation.
2018
CJK Unified Ideographs
Unicode 11.0 added 5 CJK Unified Ideographs to the BMP, bringing the total to 87,887 characters.
December 2018
DIN 91379
Previous version of DIN 2137-1 keyboard layout standard was released, which was later revised to include support for characters in DIN 91379.
2017
IDN homograph attack
Chrome tightens IDN restrictions in version 59 to prevent spoofing attacks, specifically addressing issues with Cyrillic character domain names.
2017
Universal Coded Character Set
ISO/IEC 10646:2017 was released, corresponding to Unicode 10.0, continuing the ongoing expansion of character sets.
2017
CJK Unified Ideographs
Unicode 10.0 added CJK Unified Ideographs Extension F in the SIP with 7,473 characters and 21 additional characters, increasing the total to 87,882 characters.
2016
IDN homograph attack
Google Chrome version 51 implements an algorithm to handle IDN display, similar to Firefox's approach of managing potential homograph attacks.
2015
CJK Unified Ideographs
Unicode 8.0 introduced CJK Unified Ideographs Extension E in the SIP with 5,762 characters and 9 additional characters, expanding the total to 80,388 characters.
2014
Universal Coded Character Set
ISO/IEC 10646:2014 was published, setting the groundwork for future Unicode versions with new character inclusions.
2012
IDN homograph attack
Mozilla Firefox version 22 introduces a new approach to displaying IDNs, showing them only if the TLD prevents homograph attacks or labels do not mix scripts for different languages.
2012
CJK Unified Ideographs
Unicode 6.1 added 1 character corresponding to Adobe-Japan1-6 CID+20156, increasing the total to 74,617 characters.
2011
Universal Coded Character Set
ISO/IEC 10646:2011 was released, corresponding to Unicode 6.0, with various character additions and amendments.
2010
CJK Unified Ideographs
Unicode 6.0 added CJK Unified Ideographs Extension D in the SIP with 222 characters, bringing the total to 74,616 characters.

The content above is based on material from the Wikipedia articles Universal Character Set characters, Box-drawing characters, DIN 91379, Plane (Unicode), Universal Coded Character Set, Han unification, CJK Unified Ideographs, IDN homograph attack, XeTeX, Bidirectional text, and Hearts in Unicode, which are released under the Creative Commons Attribution-ShareAlike 4.0 International License.
