Yesterday I was re-engineering a font to solve some reported issues, and I ended up implementing an OpenType Layout technique which deserves to be better known. But first you’re going to have to sit through a rant.
If you’re doing any kind of work with Indic or Brahmic scripts, you’re going to love what I have to say today.
One of the things which really drives me is resourcing underserved languages, particularly in terms of digital inclusion. Some of the most exciting times of my missionary years were when I was working for the Asia Pacific Sign Language Development Association, delivering digital content for Deaf communities, and the most exciting part of my work now is knowing that what I do has the potential to bring additional expressiveness to digitally underserved languages. But as well as the digital side, what about the analogue side? Is there a potential for fonts to be useful in language development, increasing literacy and promoting the use of minority or emerging scripts within their communities?
OK, here’s the lead - you can do amazing things with COLRv1 fonts:
I’ve previously mentioned a problem in OpenType called table packing; it’s the problem of assigning byte offsets to link the various subtables inside a GPOS/GSUB table so that no offset goes over the 16-bit maximum of 64k. It’s a very hard problem algorithmically, but it’s also a problem for me personally in my workflow.
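To make the constraint concrete, here’s a toy sketch of why naïve serialization falls over - `check_offsets` is a made-up helper for illustration, not part of any real packer, which would instead reorder, deduplicate and split subtables to stay under the limit:

```python
# Offsets in OpenType Layout are 16-bit, so a subtable placed more
# than 65,535 bytes from the start of its parent cannot be referenced.
MAX_OFFSET = 0xFFFF

def check_offsets(subtable_sizes):
    """Given subtable sizes in naive serialization order, report the
    indices of subtables whose offsets would overflow 16 bits."""
    offset = 0
    overflows = []
    for i, size in enumerate(subtable_sizes):
        if offset > MAX_OFFSET:
            overflows.append(i)
        offset += size
    return overflows

# Ten 10,000-byte subtables: the eighth onwards are out of reach.
print(check_offsets([10000] * 10))  # → [7, 8, 9]
```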
This week I have been working on some last-minute changes to a Nastaliq Urdu font. It’s been a very annoying week. I have not enjoyed this week. But looking back, it was worth it.
(with the open source toolchain.)
I’m writing a lot of Rust these days, including a bunch of font tooling in Rust. If you haven’t looked into Rust, now is the time. There are quite a few of us in the font community playing with it, and I imagine there will be strong pushes in the future to move more of the font ecosystem in the Rust direction. Two new open source font editors - runebender and MFEKglif - are both written in Rust, and I’m working on a Rust font building toolchain which is literally hundreds of times faster than Python fontmake. (I’d love to be able to say “thousands”, but not there yet; still, it’s fast enough that I get antsy when I have to wait more than three seconds for a font to build.)
In terms of programming style, I’m a hacker. I code first and do the design as I’m going along. You can get pretty far with this. You can also get into terrible messes, and end up with restrictive architecture that takes ages to unpick afterwards. So sometimes you need to think first and program later. Currently I’m working on code in Rust to emit binary OpenType Layout tables, and this is definitely one of those think-first times. So here I am, thinking first.
Here’s another thing which I always forget when I’m building variable fonts, and when I say “building variable fonts”, I mean “building things which build variable fonts”.
Whenever I play with (or make) variable fonts, I get hopelessly confused about the difference between userspace and designspace coordinates. This is me trying to figure it out once and for all, and writing down what I find so that I won’t get confused next time.
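As a reference point for that confusion: userspace values are what the user sets on an axis (e.g. a weight of 400), while designspace values are the normalized [-1, 1] scale used internally by the font. A minimal sketch of the standard per-axis normalization - `normalize` is an illustrative helper, not any particular library’s API, and this ignores any further `avar` remapping:

```python
def normalize(value, minimum, default, maximum):
    """Map a userspace axis value onto the normalized [-1, 1]
    designspace scale: minimum → -1, default → 0, maximum → 1."""
    if value < default:
        if default == minimum:
            return 0.0
        return max(-1.0, (value - default) / (default - minimum))
    if default == maximum:
        return 0.0
    return min(1.0, (value - default) / (maximum - default))

# A weight axis running 100–900 with a default of 400:
print(normalize(400, 100, 400, 900))  # → 0.0
print(normalize(900, 100, 400, 900))  # → 1.0
print(normalize(250, 100, 400, 900))  # → -0.5
```

Note the asymmetry: the two halves of the axis are scaled independently around the default, which is exactly why round-tripping between the two coordinate systems is so easy to get wrong.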
One of the funny things about programming is that you can take an operation which is fundamentally pretty stupid, and by doing it in an automated and methodical fashion, come up with a result that is quite impressive.
Fonts are, increasingly, pretty complex pieces of software. I work primarily on the layout side, creating (both manually and semi-automatically, through scripting) large collections of OpenType shaping rules for fonts with complex layout requirements. But writing the rules is half the battle. How do you know they work? How do you ensure that, at the end of the day, the collection of rules you’ve written actually produces the results that you expect, in all the cases that you expect?
I’ve recently been working on implementing an OpenType shaping engine in Python. I’ll explain more about why another time, but the short answer is (1) for Flux to visualize layout rules (including activating individual lookups and rules) without writing everything out to disk each time, and (2) so FEE plugins can reason about the effect of previous rules while working out what they want to do.
I’ve been working (and tweeting) a lot about getting Nastaliq script fonts right recently, and Mark asked an astute question:
I’ve been working as a freelance font engineer for nearly two months now, and have been keeping sporadic notes on what I’ve been doing, but inspired by Dan Reynolds’ “Freelance Diary” I thought I’d try making them public. I don’t know if I have the discipline to make this a regular feature - I don’t really have the discipline to write my work log each day. There may also be some stuff I can’t talk about due to client confidentiality. But here goes:
Recently I’ve been working on a new idea which I think is worth sitting down and explaining carefully. It started out as a way to represent the layout features inside an OpenType font so that they could be manipulated programmatically, rather than having to shuffle little bits of AFDKO code around. But it’s kind of grown a bit since then, as ideas tend to do.
This update just went out to GitHub Sponsors.
Here’s a trick I just came up with. When messing about with training neural networks, I can sometimes find myself generating and manipulating hundreds of thousands of bitmap images, which are essentially binary matrices. With font data you can generate these on the fly and feed them straight into your network, but it turns out that takes a lot of RAM, and it’s more efficient to pre-compute them.
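A minimal sketch of the packing trick, assuming NumPy and bitmaps stored as 0/1 arrays (the `pack`/`unpack` helpers are illustrative names): packing to one bit per pixel cuts the memory footprint by a factor of eight, and unpacking is cheap enough to do per batch as you feed the network.

```python
import numpy as np

def pack(bitmaps):
    """Pack (n, h, w) arrays of 0/1 values to one bit per pixel."""
    return np.packbits(bitmaps, axis=-1)

def unpack(packed, width):
    """Restore the original bitmaps, trimming any pad bits."""
    return np.unpackbits(packed, axis=-1)[..., :width]

# 1,000 random 64×64 binary "glyph images":
imgs = (np.random.rand(1000, 64, 64) > 0.5).astype(np.uint8)
packed = pack(imgs)
assert packed.nbytes == imgs.nbytes // 8   # an eighth of the RAM
assert (unpack(packed, 64) == imgs).all()  # lossless round-trip
```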
I know I said I wasn’t going to do anything more about kerning/spacing with neural networks. But, well, I have a GPU and it’s bored. I have also realised that, for the sake of posterity (and to stop myself from going around in circles doing things I’ve already tried), I should write up all the things I’ve learned and what has worked (very little) and what hasn’t (pretty much everything).
So, I couldn’t help it; I’ve fallen off the neural kerning wagon again, thanks to a fascinating article by Sebastian Kosch describing an “attention-based approach” to the spacing/kerning problem.
I’ve just uploaded to GitHub my repository for Brownie, a tool to help find photos. I started working on this once Adobe switched off the map feature of Lightroom 5.
I had another mess around with newbreak on Friday; this is my hyphenation-and-justification engine. The main idea behind newbreak is that text can be stretchy as well as space, something we need for variable fonts - perhaps for effect in display situations using Latin script, but something that’s really needed for non-Latin scripts.
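The core idea can be sketched in a few lines - `justify` is a made-up helper for illustration, not newbreak’s actual API, and a real engine also handles shrinking, penalties and line-break choice:

```python
def justify(items, target):
    """items: list of (width, stretch) pairs - every item on the line
    has a stretchability, not just the spaces. Returns adjusted
    widths that fill `target`, distributing the surplus in proportion
    to each item's stretch."""
    natural = sum(w for w, _ in items)
    total_stretch = sum(s for _, s in items)
    surplus = target - natural
    if total_stretch == 0:
        return [float(w) for w, _ in items]
    return [w + surplus * s / total_stretch for w, s in items]

# A stretchable "glyph" (think kashida) absorbs surplus alongside
# the space instead of leaving ugly gaps:
line = [(50, 0), (10, 5), (40, 10)]  # (width, stretch) per item
print(justify(line, 115))            # → [50.0, 15.0, 50.0]
```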
Yesterday I mentioned the idea of using regression instead of one-hot encoding for co-ordinates. Which is, let’s face it, a much more sensible approach. I guess what held me back was a couple of things: my experience of doing regression with neural networks has not been great so far, and the fact that there is additional structure in off-curve points - for a smooth connection, the incoming angle between a handle and a node is always equal to the outgoing angle between the node and the next handle. But anyway, I tried it, and after many epochs and reducing the LR substantially, I started getting things like this:
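For the curious, that smooth-connection constraint can be stated in a few lines - `is_smooth` is an illustrative helper, and the naive angle comparison here ignores wrap-around at ±π:

```python
import math

def angle(a, b):
    """Angle in radians of the vector from point a to point b."""
    return math.atan2(b[1] - a[1], b[0] - a[0])

def is_smooth(handle_in, node, handle_out, tolerance=1e-6):
    """At a smooth on-curve point, the incoming handle, the node and
    the outgoing handle are collinear: incoming angle == outgoing."""
    return abs(angle(handle_in, node) - angle(node, handle_out)) < tolerance

print(is_smooth((0, 0), (10, 10), (20, 20)))  # → True (collinear)
print(is_smooth((0, 0), (10, 10), (20, 5)))   # → False (a corner)
```

It’s exactly this kind of hard geometric constraint that a plain regression loss has no idea about, which is part of what makes the problem interesting.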
I’ve been messing around recently with the idea of getting a neural network to spit out characters. It’s not something I want to spend a lot of time on and develop into something serious, but it’s quite a fun little project anyway.
I’m working on switching the font shaping part of SILE to use Harfbuzz instead of Pango because reasons, and have found myself a bit hampered by the lack of useful documentation. To be fair, if you actually build HB from source you get an auto-generated API reference, but there’s nothing really explaining how to go from a string of characters to a set of glyph positioning information, which is a shame because that is what Harfbuzz is for.