Unicode: we have all heard of it, but what exactly is it? Misunderstandings abound: is it a 16-bit encoding? Is it only an encoding, or is it something larger? Do programmers or users need to do anything special to use Unicode and its related facilities? This talk introduces Unicode: what it originally was, and what it is now. We focus on its key concepts: What is a code point? A grapheme? What is normalization, and what does it change? How does Unicode change the way we manipulate text? As developers, what should we do to take that into account? This quick overview aims to demystify Unicode by covering its basics, the traps one might encounter when dealing with it, and how to avoid them.
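To make the questions above concrete, here is a minimal Python sketch (not material from the talk itself) of three of the traps mentioned: one visible character can be several code points, two strings that look identical can compare unequal until they are normalized, and some code points do not fit in a single 16-bit unit. The helper names `code_points`, `nfc_equal`, and `utf16_units` are hypothetical and exist only for illustration. Python's standard library has no grapheme segmentation, so the sketch stops at code points.

```python
import unicodedata

# Two ways to write the same visible character "é".
composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT, two code points


def code_points(text):
    """Hypothetical helper: list the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def nfc_equal(a, b):
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def utf16_units(text):
    """Hypothetical helper: count the 16-bit code units needed to encode text in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    # Python's len() counts code points, not what a reader sees as characters.
    print(code_points(composed), len(composed))
    print(code_points(decomposed), len(decomposed))

    # A naive comparison fails; comparing normalized forms succeeds.
    print(composed == decomposed)
    print(nfc_equal(composed, decomposed))

    # U+1F600 lies outside the first 65,536 code points and needs two UTF-16 units.
    print(utf16_units("\U0001F600"))
```

Normalizing at the boundaries of a program (input, comparison, storage) is one common way to avoid the comparison trap; whether NFC or another form is appropriate depends on the application.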
Speakers: Lucas Bajolet