simoncozens.github.io

Simon Cozens technical blog

Android Text Clipping

Recently I had to understand how Android interprets a font’s vertical metrics, and when text in a text box will or will not be clipped. And the answer is: “it depends”.

Avoiding GSUB Overflows

Here are some techniques adapted from an email I sent recently - a project with complicated contextual substitution rules was having problems with GSUB tables overflowing and refusing to compile. Follow me for a deep dive into contextual chaining rules, binary layout, and why compilers are stupid so we have to be clever…

FontOps: Font Development At Scale

Two years ago, I took over the development (and later, the technical programme management) of the Noto project, a library of nearly 250 font families spanning over 150 writing systems. Noto provides “fallback” fonts for language support on Android, iOS, macOS and Linux - billions of devices across the planet, and for many writing systems, Noto’s fonts are the only fonts available. How on earth can we manage and maintain such a diverse and important catalogue?

Advanced Glyph Reordering

Yesterday I was re-engineering a font to solve some reported issues, and I ended up implementing an OpenType Layout technique which deserves to be better known. But first you’re going to have to sit through a rant.

Semiautomated Handwriting Fonts

One of the things which really drives me is resourcing underserved languages, particularly in terms of digital inclusion. Some of the most exciting times of my missionary years were when I was working for the Asia Pacific Sign Language Development Association, delivering digital content for Deaf communities, and the most exciting part of my work now is knowing that what I do has the potential to bring additional expressiveness to digitally underserved languages. But as well as the digital side, what about the analogue side? Is there a potential for fonts to be useful in language development, increasing literacy and promoting the use of minority or emerging scripts within their communities?

FontTools Serialization - a note to self

I’ve previously mentioned a problem in OpenType called table packing; it’s the problem of assigning byte offsets to link the various subtables inside a GPOS/GSUB table so that no offset goes over the maximum limit of 64k. It’s a very hard problem algorithmically, but it’s also a problem for me personally in my workflow.

Nastaliq Dot Positioning

This week I have been working on some last-minute changes to a Nastaliq Urdu font. It’s been a very annoying week. I have not enjoyed this week. But looking back, it was worth it.

Five Rust tips for Python Programmers

I’m writing a lot of Rust these days, including a bunch of font tooling in Rust. If you haven’t looked into Rust, now is the time. There are quite a few of us in the font community playing with it, and I imagine there will be strong pushes in the future to move more of the font ecosystem in the Rust direction. Two new open source font editors, runebender and MFEKglif, are both written in Rust, and I’m working on a Rust font building toolchain which is literally hundreds of times faster than Python fontmake. (I’d love to be able to say “thousands”, but I’m not there yet; still, it’s fast enough that I get antsy when I have to wait more than three seconds for a font to build.)

Notes on efficient packing of OpenType layout

In terms of programming style, I’m a hacker. I code first and do the design as I’m going along. You can get pretty far with this. You can also get into terrible messes, and end up with restrictive architecture that takes ages to unpick afterwards. So sometimes you need to think first and program later. Currently I’m working on code in Rust to emit binary OpenType Layout tables, and this is definitely one of those think-first times. So here I am, thinking first.

Userspace and designspace - a note to self

Whenever I play with (or make) variable fonts, I get hopelessly confused about the difference between userspace and designspace coordinates. This is me trying to figure it out once and for all, and writing down what I find so that I won’t get confused next time.

Automated Kerning for Nastaliq

One of the funny things about programming is that you can take an operation which is fundamentally pretty dumb, and by doing it in an automated and methodical fashion, come up with a result that is quite impressive.

Better Fonts Through Test-Driven Development

Fonts are, increasingly, pretty complex pieces of software. I work primarily on the layout side, creating (both manually and semi-automatically, through scripting) large collections of OpenType shaping rules for fonts with complex layout requirements. But writing the rules is only half the battle. How do you know they work? How do you ensure that, at the end of the day, the collection of rules you’ve written actually produces the results that you expect, in all the cases that you expect?

How I implemented the Universal Shaping Engine in 200 lines of code

I’ve recently been working on implementing an OpenType shaping engine in Python. I’ll explain more about why another time but the short answer is (1) for Flux to visualize layout rules (including activating individual lookups and rules) without writing everything out to disk each time, and (2) so FEE plugins can reason about the effect of previous rules while working out what they want to do.

What did I do this week? (w/b 2020-07-20)

I’ve been working as a freelance font engineer for nearly two months now, and have been keeping sporadic notes on what I’ve been doing, but inspired by Dan Reynolds’ “Freelance Diary” I thought I’d try making them public. I don’t know if I have the discipline to make this a regular feature - I don’t really have the discipline to write my work log each day. There may also be some stuff I can’t talk about due to client confidentiality. But here goes:

fontFeatures and the "OpenType CPU"

Recently I’ve been working on a new idea which I think is worth sitting down and explaining carefully. It started out as a way to represent the layout features inside an OpenType font so that they could be manipulated programmatically, rather than having to shuffle little bits of AFDKO code around. But it’s kind of grown a bit since then, as ideas tend to do.

A Python Data Storage Trick

Here’s a trick I just came up with. When messing about with training neural networks, I can sometimes find myself generating and manipulating hundreds of thousands of bitmap images, which are essentially binary matrices. With font data you can generate these on the fly and feed them straight into your network, but it turns out that takes a lot of RAM, and it’s more efficient to pre-compute them.

Neural Kerning Log

I know I said I wasn’t going to do anything more about kerning/spacing with neural networks. But, well, I have a GPU and it’s bored. I have also realised that, for the sake of posterity (and to stop myself from going around in circles doing things I’ve already tried), I should write up all the things I’ve learned and what has worked (very little) and what hasn’t (pretty much everything).

fontmetrics library

So, I couldn’t help it; I’ve fallen off the neural kerning wagon again, thanks to a fascinating article by Sebastian Kosch describing an “attention-based approach” to the spacing/kerning problem.

Brownie

I’ve just uploaded my repository for Brownie, a tool to help find photos, to GitHub. I started working on this once Adobe switched off the map feature of Lightroom 5.

More on newbreak

I had another mess around with newbreak on Friday; this is my hyphenation-and-justification engine. The main idea behind newbreak is that text can be stretchy as well as space, something we need for virtual fonts - perhaps for effect in display situations using Latin script, but something that’s really needed for non-Latin scripts.

Regression works!

Yesterday I mentioned the idea of using regression instead of one-hot encoding for co-ordinates. Which is, let’s face it, a much more sensible approach. I guess what held me back was a couple of things: my experience of doing regression with neural networks has not been great so far, and that there is an additional structure in off-curve points in that, for a smooth connection, the incoming angle between a handle and a node is always equal to the outgoing angle between the node and the next handle. But anyway, I tried it, and after many epochs and reducing the LR substantially, I started getting things like this:

Neural font design

I’ve been messing around recently with the idea of getting a neural network to spit out characters. It’s not something I want to spend a lot of time on and develop into something serious, but it’s quite a fun little project anyway.

A Duffer's guide to Fontconfig and Harfbuzz

I’m working on switching the font shaping part of SILE to use Harfbuzz instead of Pango because reasons, and have found myself a bit hampered by the lack of useful documentation. To be fair, if you actually build HB from source you get an auto-generated API reference, but there’s nothing really explaining how to go from a string of characters to a set of glyph positioning information, which is a shame because that is what Harfbuzz is for.
