Talk Abstract: Character encodings can confuse any developer and present pitfalls even for the most experienced ones. As a result, a lot of the time we end up with something that “just works” without an in-depth understanding of the concepts involved and what impact they can have. In this talk, Anna will give an overview of what character encodings are, what the JavaScript language provides to interact with them, and how to avoid the most common mistakes in Node.js and the Web.

Why is this event important to you?

I’m talking about a topic that is important to me and that is rare to see at conferences, particularly in an English-speaking country (as native English speakers tend to have a different connection to the topic, since English text can be represented in ASCII alone).

Have you attended this event before?

No, this will be my first time so I’m very excited to attend.

What will your QCon talk be about?

I will talk about character encodings, their history, the common mistakes that developers tend to make when working with them and why those mistakes happen. I’ll also talk about the APIs for interacting with them in Node.js and the browser. I hope to give some insight into the design choices that were made when creating them, too – for example, why UTF-8 holds the special place that it now does and what trade-offs that entails.
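As a small illustration of the kind of API the talk covers: `TextEncoder` and `TextDecoder` are available in both modern browsers and Node.js, and a sketch of a UTF-8 round-trip might look like this.

```javascript
// TextEncoder always produces UTF-8; TextDecoder defaults to UTF-8
// but can also decode other encodings.
const encoder = new TextEncoder();
const bytes = encoder.encode('héllo'); // Uint8Array of UTF-8 bytes

const decoder = new TextDecoder(); // 'utf-8' by default
console.log(decoder.decode(bytes)); // 'héllo'

// 'é' takes two bytes in UTF-8 (0xC3 0xA9), so the byte length
// differs from the JavaScript string length.
console.log(bytes.length);   // 6
console.log('héllo'.length); // 5
```

The byte-length/string-length mismatch at the end is exactly the sort of detail that trips up code assuming one byte per character.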

Why have you chosen this topic?

It’s an interesting topic for developers – something that lots of them just want to “make work” in some way but that is not always looked at in-depth, leading to situations in which the software does the right thing 99% of the time. It’s also something that shows up in the Node.js GitHub issue trackers on a regular basis – even with UTF-8 established as “the” standard encoding for modern applications, there’s still a lot that can go wrong.

What do you hope it will accomplish?

It will give developers an overview of the APIs for dealing with character encodings, an understanding of how they work, and of why the common mistakes appear and why they are common, as well as some general insight into internationalization issues. I also hope that talking about the trade-offs made when designing character encodings will be useful when looking at the design choices of other software APIs.

What have been the most notable changes in Character Encodings over the years?

I think that’s mostly the gradual introduction of the `TextEncoder`/`TextDecoder` web API, as far as browsers and Node.js are concerned, as well as maybe some standardization in JS engines around how to deal with invalid input. This talk isn’t necessarily about recently introduced or bleeding-edge features, though.
