this space intentionally left blank

December 13, 2023

Filed under: tech»web»components

What, When, Where: Event-driven apps from modern browser primitives

React's big idea was always the render function. Even at its initial presentation in 2013, the developers were very clear that the original class syntax was just a thing they added to meet contemporary expectations. They also, at the time, stressed that React could be mixed into other code. You could migrate your application over to it piecemeal, taking advantage of the speed improvements they promised in hot spots.

Over the following decade, React took over the whole application space, but conceptually it never moved past render(), and in fact almost everything else was gradually stripped away. When the deprecation of class components removed local state and lifecycle methods, they were replaced with stores like Redux, or contexts, and eventually hooks — all of which are complex and come with a laundry lists of caveats and limitations, but they "solve" the problems caused by eliminating everything that isn't a pure function. The history of the entire project has been constant, downward pressure, moving everything into the view callback. It's all one undifferentiated slab of JSX now.

Perhaps this marks me as a radical, but my thesis is that it may not be beneficial to try to reduce your solutions until they can fit in a cramped, ideologically-constrained display layer. I think it's a good thing when an application has a little flexibility depending on the problems relevant to each part, just as it's good to build a house out of different materials instead of just pouring concrete into a giant mold and calling it a day.

When critics say that web components are incomplete compared to React and its competitors, they're not wrong: if you want One Weird Trick for your entire codebase, you'll be disappointed. But if you're using web components, it may be useful to ask whether you can get many of the benefits of frameworks — live updates, cross-cutting data, loose coupling — without going down the same rabbit holes or requiring the extensive build infrastructure they depend on. It's worth thinking about what we could do if we used the platform to fill those gaps, and for me that starts with events.

Subscribable stores

A common problem: I want to share some state across components that are not located close to each other in the UI tree, and be notified when that state changes so I can re-render.

The most basic solution to this is an event emitter with getter/setter methods wrapping its value. Back in the bad old days, you'd have to roll your own, but EventTarget (the common interface for all DOM classes that dispatch events) has been widely subclassable for a few years now. Our store definition probably looks something like this:

class Store extends EventTarget {
  state = undefined;
  
  constructor(initial) {
    super();
    this.state = initial;
  }

  get value() {
    return this.state;
  }
  
  set value(state) {
    if (this.state == state) return;
    this.state = state;
    this.notify("update", state);
  }
  
  //convenience method for atomic get/set
  update(fn) {
    this.value = fn(this.state)
  }
  
  notify(type, detail) {
    this.dispatchEvent(new CustomEvent(type, { detail }));
  }
}

When we want to use this, it's largely similar to the way that a "context" works in other frameworks. You set up a store in a module, and then in places where that data is important, you import it and either subscribe, update its value, or both. Depending on your base class and your templating, you can even auto-subscribe to it in the course of rendering — remember, addEventListener() automatically de-duplicates listeners, so it's safe to call it redundantly as long as you're passing in the same reference (i.e., use a bound method or a handler object, not a fresh arrow function).

This particular store would need to be adapted if your data is deeply-nested, or if you're planning to mutate it in place, since it only notifies subscribers if the reference identity of its data changes. One option would be to build a proxy-based reactive object, similar to what Vue uses, which can be done in about a hundred lines of code for the basics. You could just import @vue/reactivity, of course, but it's educational to do it yourself.

The subscribable store can be designed with a particular shape of object or collection in mind, and offer methods for working with that shape. In my podcast client, I use a Table class that provides promised-based IndexedDB access and fires events whenever feeds are added, removed, or updated in the database.

My other favorite use case for subscriptions is anything based on external stimuli, such as network polling or push notifications, especially if that external source has rich, non-uniform behavior (say, a socket that syncs state with the server, but also lets you know when your app doesn't have network connectivity so that the UI can disable some features).

This design for reactivity is no longer fashionable, but the pace of JavaScript's pop culture makes it easy to forget that it was only 2019 when Svelte v3 (to pick an example) moved from an explicitly event-driven stores to the current syntax sugar. Behind the scenes, the store contract is still basically an event dispatcher with only one event, it's just hidden by the compiler. If we don't use a compiler, we may have to subscribe by hand, but on the other hand we won't be caught on an update treadmill when the framework devs discover observables (sorry, "runes") four years later.

Personally, I don't think it actually matters very much how you get notified for low-level re-renders — if you wanted to argue that a modern framework uses signals for granular reactivity, that's fine by me — but what I like about standardizing on EventTarget for news about high-level concerns is that it's already familiar, it's free with the browser, and it encourages us to think about changes to data more coherently than "a single value" or "a slice of a big state object."

Broadcast messages

The preoccupation with reducing everything to data transformation is a common blind spot in current front-end frameworks. Data is important, of course — I'm a firm believer in the Linus Torvalds maxim that good programmers worry more about structure than they do code — but sometimes something happens that doesn't create a notable change, or it creates different kinds of changes in different places, or it's a long-running process that we just want to keep an eye on. Not everything is a noun! We need verbs, too!

When I worked on Caret from 2014 to 2018 or so, I was learning a lot about how to structure a relatively large, complex application — certainly it was the biggest thing I'd ever built on my own. But one of the best decisions I made early on was to have different parts of the editor communicate with each other using command messages sent over a central pub/sub (publish and subscribe) channel.

This had a lot of advantages. Major systems didn't need to be tightly coupled together, especially around menus and keyboard shortcuts, which effectively transformed streams of input events into higher-level commands. Some web apps may be able to pretend that the real work is safely isolated from side effects and statefulness, but a programmer's text editor has to deeply care about making input both easy and extensible. And as Caret went from a basic Notepad.exe replacement to a much more full-featured editor, its vocabulary of commands expanded naturally.

Take live settings, for example: Caret saved user preferences in JSON "files," which were persisted to Chrome's synchronized storage. When these changed, the settings provider would send an "init:restart" announcement over the command bus, and modules that used these files would reload their configuration to match. Importantly, the provider did not need to know which systems were listening, or what specific options they cared about (if any). The command was explicit, auditable, and self-explanatory, as opposed to a reactive framework where the settings object changes and a half-dozen other modules spontaneously reload themselves.

Like our subscribable store, the message channel is a subclass of EventTarget. It doesn't retain a value, but it can have a method to simplify the event creation and dispatch process. Instantiate that class, export it from the module, and import it anywhere you want to listen to messages.

class MessageBus extends EventTarget {
  broadcast(type, detail) {
    var e = new CustomEvent(type, { detail });
    this.dispatchEvent(e);
    return e;
  }
}

export const channel = new MessageBus();

I recommend namespacing (and probably defining constants for) your type strings early, since they're going to be sent far and wide: "change" or "update" isn't very useful when lots of things could be changing/updating, but "session:saved" (with the filename attached to the event detail) means you're less likely to collide with other messages, especially on a team.

The main thing I regret from Caret was not having a way for event consumers to send values back to the broadcaster directly. There was an optional callback in the event emitter code, but it was awkward to use, especially if multiple listeners wanted to return values. If I were building it now, I would imitate the Service Worker API and offer a respondWith() method on events:

class RespondableEvent extends Event {
  #responses = [];
  
  constructor(type, data) {
    super(type);
    this.data = data;
  }
  
  respondWith(response) {
    this.#responses.push(response);
  }
  
  // make responses add-only and async
  get responses() {
    return Promise.all(this.#responses);
  }
}

Listeners that need additional time to prepare can respond with a Promise instead of a direct value, meaning that this also doubles as a waitUntil() method. On the other end, the broadcasting module holds onto a reference to the event, and checks to see if it needs to take further action. In Caret, this would have been really useful for providing extension points like language servers or build automation:

var e = new RespondableEvent("file:beforesave", fileContents);
channel.dispatchEvent(e);
// check to see if any plugins responded
var annotations = await e.responses;
// add annotations to the editor control
for (var annotation of annotations) {
  /* ... */
}

Scenarios where asynchronous event responses are necessary are rare, but when you need them, they're invaluable, and this design doesn't add any overhead when not used.

Software as Metaphor

Conway's Law in software development says that the systems designed by an organization are a reflection of its communication structures. I would take that further: the systems we design are, at least a little, a reflection of the way we want the world to work. Part of the reason I like using event-driven architectures is because they effectively create chatty little communities within the program — colonial organisms, like a Portuguese man o' war (though hopefully less dangerous) — and for all my misanthropic tendencies, I do still believe we live in a society.

More importantly, this is a way to think about high-level architecture, but it does not have to be the single method for every part of the app. As I said at the start, I'm suspicious of all-encompassing framework paradigms. If our software is a microcosm of our ideal environment, there's something worrying about reducing all processes to a "pure" transform of input and output.

Web components are not a complete framework in the way that React (or Vue, or Svelte) is. They're just one layer of an application. You can see that as a flaw, but I think it's an opportunity to go back to software that has texture to it, where the patterns that are used at the top level do not have to be the exact same as those used in individual modules, or at lower layers of the stack, if it turns out that they're not well-suited to the problem at hand.

And to be fair, outside of React we see a lot more experimentation with forms of coordination that aren't tied so tightly to one particular VDOM. Preact's signals, for example, provide a level of reactivity that can be used anywhere, and which you could easily integrate with the architecture I've described (listeners updating signal values, and effect functions dispatching events).

I don't think web components are the only reason for that, but I do think their existence as a valid alternative — a kind of perpetual competition to framework code, in which you can get started without a single import statement or npm install — means that there's greater incentive to build primitives that are interoperable, not locked to a single ecosystem.

In the context of that reset, events make sense to me in terms of organizing my code, but I'm hopeful that we'll soon find other techniques re-emerging from the vast prior art of UI toolkits, both native and on the web. There's a lot more out there than closures and currying, and I can't wait to see it.

December 10, 2023

Filed under: tech»web»components

Cheap and Cheerful: trivial tricks for custom elements

I have one draft written about framework-free architecture, but it gets a little heavy, and I don't want to go there yet. So instead, here are some minor uses of custom elements that, for me, spark joy.

Set dynamic CSS variables

One of the things that cracks me up when I've been reading about React contexts is how they're almost always demonstrated on (and encouraged for) visual theming. Did nobody tell the developers that CSS... cascades? It's in the name! This is what custom properties are for! (Actual answer: they know and they don't care, because React was designed by people who would rather build countless levels of ideological abstraction than actually use the browser in front of them.)

CSS custom properties have been fairly well-supported since 2016, which is when Chrome and Safari shipped them. When I first started using them, they felt like a step back from the Less variables that I was used to, with the clunky var() syntax required to unwrap them, and calc() to operate on their contents. But custom properties are actually a lot more powerful because they do cascade — you can override them for sections of the DOM tree selectively — and because they can be updated on the fly, either using media queries or values from JavaScript.

Web components are a great way to take dynamic information and make it available for styling, since any CSS properties they set on themselves will then cascade down through the rest of the tree. Imagine we have the following markup:

<mouse-colorizer>
  <h1 style="color: var(--mouse-color, salmon)">
    This space intentionally left blank.
  </h1>
</mouse-colorizer>

If we don't do anything, the inner <h1> will be salmon-colored. But if we define the mouse colorizer element...

class MouseColorizer extends HTMLElement {
  constructor() {
    super();
    this.addEventListener("mousemove", this);
  }

  handleEvent(e) {
    var normalX = e.offsetX / this.offsetWidth;
    var normalY = e.offsetY / this.offsetHeight;
    this.style.setProperty(
      "--mouse-color",
      `rgb(${255 * normalX}, 255, ${255 * normalY})`
    );
  }
}

customElements.define("mouse-colorizer", MouseColorizer);

Now moving the mouse around inside the <mouse-colorizer>'s bounding box will set the color of the headline, between green, cyan, yellow, and white at each corner. Other elements inside this branch of DOM can also use this variable, for any style where a color value is valid: borders, background, shadows, whatever. There are lots of serious uses for being able to set dynamic cascade values, but I think it's just as valuable when it's a little mischievious:

  • Parallax effects that shift with the device orientation (GitHub's 404 page used to do this, but they seem to have dropped it lately)
  • Animations that progress as the page scrolls
  • Tinting images based on the time of day
  • Fonts that get just a little blurrier every time you touch the screen

While at NPR, I was working on a project for Louder than a Riot (RIP), one of the company's music podcasts. We wanted to give the page some life, and tie the visuals to the heavy use of audio samples. Animating the whole page with JavaScript was possible, but CSS custom properties gave us an easier solution. I hooked up the player to a WebAudio analyzer node, and had it dispatch events with the average amplitude of the sample window. Then, I set up a <speaker-boxxx> element that listened for those events, and set its --volume property to match. Styles inside of a <speaker-boxxx> could use that value to set color mixes and transforms (or any other style), so that when a track was playing, UI elements all over the page grooved along with it.

Wrap old libraries

A couple of weeks ago I needed a map for a story (people love maps, and although they're rarely the appropriate choice for data visualization, in this case it made sense). Actually, I needed two maps: the reporter wanted to compare the availability of 8th grade algebra over the last ten years.

My go-to library for mapping is Leaflet. It's long in the tooth at this point, and I'm sure that there are other libraries that would offer features like vector tiles or GPU-accelerated rendering. But Leaflet is fast, and it's well-proven, so I've stuck with it.

The thing is, if you need multiple maps, Leaflet can be kind of a pain. It works on the jQuery UI model, where you have to point it at an element and call a factory function to set it up, and then you have a map object that's separate from the DOM that it's attached to. It sure would be nice if the page just took care of that — I don't know, like a callback when the right element gets connected or something.

Well.

class MapElement extends HTMLElement {

  connectedCallback() {
    this.map = new leaflet.Map(this, {
      zoomSnap: .1,
      scrollWheelZoom: false,
      zoomControl: false,
      attributionControl: false
    });
    this.map.focus = () => null;

    leaflet.tileLayer("https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png", {
      subdomains: "abcd".split("")
    }).addTo(this.map);
  }
}

Now I can get a list of maps just by running a query for leaflet-map elements, and populate them based on a filter value that I take from their data-year attributes. It's not that you can't write this in a DRY way, but it feels cleaner to me if I can just hand off the setup to the browser.

We talk a lot about using web components as glue between modern frameworks, but they're also really useful for wrapping up non-framework code that you don't want to think about. If you're maintaining a legacy site that uses older widget libraries, it's worth thinking about whether some of them can be replaced with custom elements to clean up some of the boilerplate and lifecycle management you're currently managing manually.

Add harmless interaction effects

When custom elements were first introduced, Google tried to make them more palatable by packaging them as Polymer, a library of tags and tools. It's hard to say why Google kills anything, but it didn't help that the framework leaned heavily on HTML imports, which (sadly) did not make it out of standardization.

I was pretty lukewarm on Polymer, but one thing I did like was that it had a set of paper-x tags that implemented bits of Material Design, like the ripple effect for clicks that you can still see in some Google UI. These are good candidates for custom elements, because they're client-only and if JavaScript doesn't load, they just don't do anything, but the page still works. They're an easy way to add personality to a design system.

Let's set up our own ripple layer tag as a demonstration. We'll start by creating some shadow DOM and injecting a canvas that's positioned absolutely within its container. We'll also register for pointer events, and bind the "tick" method that runs our animations:

var template = `
<style>
  :host {
    position: relative;
    display: block;
  }

  canvas {
    position: absolute;
    inset: 0;
    width: 100%;
    height: 100%;
    background: transparent;
  }
</style>
<canvas></canvas>
<slot></slot>
`;

class RippleLayer extends HTMLElement {
  ripple = null;

  constructor() {
    super();
    var root = this.attachShadow({ mode: "open" });
    root.innerHTML = template;
    var canvas = root.querySelector("canvas");
    this.context = canvas.getContext("2d");
    this.tick = this.tick.bind(this);
    this.addEventListener("pointerdown", this);
  }
}

With the infrastructure in place, we'll watch for pointer events and start a ripple when one occurs, storing the important information about when and where the click or tap occurred:

handleEvent(e) {
  this.releasePointerCapture(e);
  this.ripple = {
    started: Date.now(),
    x: e.offsetX,
    y: e.offsetY
  }
  this.tick();
}

Finally, we'll add the tick() method that actually draws the ripple. Our animation basically just looks at the current time and draws a circle based on where it should be at that point — we don't need to retain any information from one frame to the next.

tick() {
  // resize the canvas buffer 1:1 with its CSS size
  var { canvas } = this.context;
  canvas.width = this.offsetWidth;
  canvas.height = this.offsetHeight;
  // find out how far in the ripple we've gotten
  var elapsed = Date.now() - this.ripple.started;
  var duration = this.getAttribute("duration") || 300;
  var delta = elapsed / duration;
  // if the ripple is complete, don't draw and stop animating
  if (delta >= 1) return;
  // determine the size of the ripple
  var eased = .5 - Math.cos(delta * Math.PI) / 2;
  var rMax = canvas.width > canvas.height ? canvas.width * .8 : canvas.height * .8;
  // draw a darker outline and lighter inner circle
  this.context.arc(this.ripple.x, this.ripple.y, rMax * eased, 0, Math.PI * 2);
  this.context.globalAlpha = (1 - delta) * .5;
  this.context.stroke();
  this.context.globalAlpha = (1 - delta) * .2;
  this.context.fill();
  // schedule the next update
  requestAnimationFrame(this.tick);
}

Once this element is defined, you can put a <ripple-layer> anywhere you want the effect to occur, such as absolutely positioned in a button. This does require the "layer" to be on top — if you need to have multiple clickable items within a ripple zone, invert the tag by adding a slot, so that the layer wraps around content and adds an effect to it instead of vice-versa.

November 28, 2023

Filed under: tech»web»components

goodbytes: Designing Custom Element Base Classes

In my mind, Michael Crichton's Jurassic Park marks the last time object-oriented programming was cool. Dennis Nedry, the titular park's sole computer engineer, adds a backdoor to the system disguised as "a block of code that could be moved around and used, the way you might move a chair in a room." Running whte_rabt.obj as a shell script turns off the security systems and electric fences, kicking off the major crisis that drives the novel forward. Per usual for Crichton, this is not strictly accurate, but it is entertaining.

(Crichton produced reactionary hack work — see: Rising Sun, Disclosure, and State of Fear — roughly as often as he did classic high-tech potboilers, but my favorite petty grudge is in The Lost World, the cash-grab sequel to Jurassic Park, which takes a clear potshot at the "This is a UNIX system, I know this!" scene from Spielberg's film: under siege by dinosaurs, a young woman frantically tries to reboot the security system before suddenly realizing that the 3D graphics onscreen would require a high-bandwidth connection, implying — for some reason — a person-sized maintenance tunnel she can use as an escape route. I love that they can clone dinosaurs, but Jurassic Park engineers do not seem to have heard of electrical conduits.)

In the current front-end culture, class-based objects are not cool. React is (ostensibly) functional wherever possible, and Svelte and Vue treat the module as the primary organizational boundary. In contrast, web components are very much built on the browser platform, and browsers are object-oriented programs. You just can't write vanilla JavaScript without using new, and I've always wondered if this, as much as anything else, is the reason a lot of framework authors seem to view custom elements with such disdain.

Last week, I wrote about slots and shadow DOM as a way to build abstract domain-specific languages and expressive web components. In this post, I want to talk about how base classes and inheritance can smooth out its rough edges, and help organize and arrange the shape of your application. Call me a dinosaur (ha!), but I think they're pretty neat.

Dino DNA

Criticisms of custom elements often center around the amount of code that it takes to write something fairly simple: comparing the 20-line boilerplate of a completely fresh web component against, say, a function with some JSX in it. For some reasons, these comparisons never discuss how that JSX is transpiled and consumed by thousands of lines of framework dependencies — that's just taken for granted — or that some equivalent could also exist for custom elements.

That equivalent is your base class. Rather than inheriting directly from HTMLElement, you inherit from a middleware class that extends it, and fills in the gaps that the browser doesn't directly provide. Almost every project I work on either starts with a base element, or eventually acquires one. Typically, you'll want to include:

  • Some kind of templating for the shadow DOM, and optionally for the light DOM.
  • Code that reflects observed attributes to properties, or vice versa.
  • Method binding, for event listeners and callbacks.
  • Event dispatching, using either CustomEvent or a subclass for your application.

If you don't feel capable of providing these things, or you're worried about the maintenance burden, you can always use someone else's. Web component libraries like Lit or Stencil basically provide a starter class for you to extend, already packed with things like reactive state and templating. Especially if you're working on a really big project, that might make sense.

But writing your own base class is educational at the very least, and often easier than you might think, especially if you're not working at big corporate scale. In most of my projects, it's about 50 lines (which I often copy verbatim from the last project), and you can see an example in my guidebook. The templating is the largest part, and the part where just importing a library makes the most sense, especially if you're doing any kind of iteration. That said, if you're mostly manipulating individual, discrete elements, a pattern I particularly like is:

class TemplatedElement extends HTMLElement {
  elements = {};

  constructor() {
    super();
    // get the shadow root
    // in other methods, we can use this.shadowRoot
    var root = this.attachShadow({ mode: "open" });
    // get the template from a static class property
    var { template } = new.target;
    if (template) {
      root.innerHTML = template;
      // store references to marked template elements
      for (var element of root.querySelectorAll("[as]")) {
        var name = element.getAttribute("as");
        this.#elements[name] = element;
      }
    }
  }
}

From here, a class extending TemplatedElement can set a string as the static template property, which will then be used to set up the shadow DOM on instantiation. Any tag in that template with an "as" attribute will be stored on the elements lookup object, where we can then add event listeners or change its content:

class CounterElement extends TemplatedElement {
  static template = `
<div as="counter">0</div>
<button as="increment">Click me!</button>
  `;
  
  #count = 0;
  
  constructor() {
    // run the base class constructor
    super();
    // get our cached shadow elements
    var { increment, counter } = this.elements;
    increment.addEventListener("click", () => {
      counter.innerHTML = this.#count++;
    });
  }
}

It's simple, but it works pretty well, especially for the kinds of less-intrusive use cases that we're seeing in the new wave of HTML components.

For the other base class responsibilities, a good tip is to try to follow the same API patterns that are used in the platform, and more specifically in JavaScript in general (a valuable reference here is the Web Platform Design Principles). For example, when providing method binding and property reflection, I will often build the interface for these as arrays assigned to static properties, because that's the pattern already being used for observedAttributes:

class CustomElement extends BaseClass {
  static observedAttributes = ["src", "controls"];
  static boundMethods = ["handleClick", "handleUpdate"];
  static reflectedAttributes = ["src"];
}

I suspect that once decorators are standardized, they'll be a more pleasant way to handle some of this boilerplate, especially since a lot of the web component frameworks are already doing so via Typescript. But if you're using custom elements, there's a reasonable chance that you're interested in no-build (or minimal build) systems, and thus may want to avoid features that currently require a transpiler.

Clever Girl

If you are building web components entirely as leaf nodes that are meant to be inserted into an arbitrary page, or embedded into another framework, mimicking the platform is probably enough. For example, on an input-related element you might add a getter to your class that provides the valueAsNumber property just like the browser's own input tags.

But if you're designing larger applications, then your components will need to interact with each other. And in that case, a class is not just a way of isolating some DOM code, it's also a contract between application modules for how they manage state and communication. This is not new or novel — it's the foundation of model-view-controller UI dating back to Smalltalk — but if you've learned web development in the era since Backbone fell out of popularity, you may have never really had to think about state and interaction between components, as opposed to UI functions that all access slices of a common state store (or worse, call out to hooks and magically get served state from the aether).

Here's an example of what I mean: the base class for drawing instructions in Tarot, Chalkbeat's social media image generator, does the normal templating/binding dance in its constructor. It also has some utility methods that most canvas operations will need, such as converting between normalized coordinates and pixels or turning variable-length CSS padding strings into a four-item array. Finally, it defines a number of "stub" methods that subclasses are expected to override:

  • persist() and restore() transfer values between elements with the same ID when the user switches card layouts, triggered by the connected and disconnected callbacks.
  • getLayout() returns a DOMRect with the bounding box that the component plans to render to, so that parent elements can perform layout tasks like flex spacing.
  • draw() actually renders to a canvas context, usually based on the information that getLayout() provided.

When Tarot needs to re-render the canvas, it starts at the top level of the input form, loops through each direct child, and calls draw(). Some instructions, like images or rectangle fills, render immediately and exit. The layout brushes, <vertical-spacer> and <vertical-stack>, first call getLayout() on each of their children, and use those measurements to apply a transform to the canvas context before they ask each child to draw. Putting these methods onto the base class in Tarot makes the process of adding a new drawing type clear and explicit, in a way that (for me) the "grab bag of props" interface in React does not.

Two brushes actually take this a little further. The <series-logo> and <logo-brush> elements don't inherit directly from the Brush base class, but from a specialized subclass of it with properties and methods for storing and tinting bitmaps. As a result, they can take a single-color input PNG and alter its pixels to match any of the theme colors selected while preserving alpha, which means we can add new brand colors to the app and not have to generate all new logo art.

Planning the class as an API contract means that when they're slotted or placed, we can use duck-typing in our higher-level code to determine whether elements should participate in a given operation, by checking whether they have a method name that matches our condition. We can also use instanceof to check if they have the required base class in their prototype chain, which is more strict.

Hold Onto Your Butts

It's worth noting that this approach has its detractors, and has for a (relatively) long time. In 2015, the React team published a blog post claiming that traditional object-oriented code inherently creates tight coupling, and the code required grows "as the square of the number of possible states of the component." Personally I find this disingenuous, especially when you step back and think about the scale of the infrastructure that goes into the "easier" rendering method it describes. With a few small changes, it'd be indistinguishable from the posts that have been written discounting custom elements themselves, so I guess at least they're consistent.

As someone who cut their teeth working in ActionScript 3, it has never been obvious to me that stateful objects are a bad foundation for creating rich interfaces, especially when we look at the long history of animation libraries for React — eventually, every pure functional GUI seems to acquire a bunch of pesky escape hatches in order to do anything useful. Weird how that happens! My hot take is that humans are messy, and so code that interacts directly with humans tends to also be a little messy, and trying to shove it into an abstract conceptual model is likely to fail in frustrating ways. Objects are often untidy, but they give us more slack, and they're easier to map to a mental model of DOM and state relationships.

That said, you can certainly create bad class code, as the jokes about AbstractFactoryFactoryAdapter show. I don't claim to be an expert on designing inheritance — I've never even drawn a UML diagram (one person in the audience chuckles, glances around, immediately quiets). But there are a few basic guidelines that I've found useful so far.

Remember that state is inspectable. If you select a tag in the dev tools and then type $0.something in the console, you can examine a JS value on that element. You can also use console.dir($0) to browse through the entire thing, although this list tends to be overwhelming. In Chrome, the dev tools can even examine private fields. This is great for debugging: I personally love being able to see the values in my application via its UI tree, instead of needing to set breakpoints or log statements in pure rendering functions.

Class instances are great places for related platform objects. When you're building custom elements, a big part of the appeal is that they give you automatic lifecycle hooks for the section of the page tree that they wrap. So this might be obvious, but use your class to cache references to things like Mutation Observers or drawing contexts that are related to the DOM subtree, even if they aren't technically its state, and use the lifecycle to set them up and tear them down.

Use classes to store local state, not application state. In a future post, I want to write about how to create vanilla code that can fill the roles of stores, hooks, and other framework utilities. The general idea, however, is that you shouldn't be using web components for your top-level application architecture. You probably don't need <application-container> or <database-connection>. That's why you...

Don't just write classes for your elements. In my podcast client, a lot of the UI is driven by shared state that I keep in IndexedDB, which is notoriously frustrating to use. Rather than try to access this through a custom element, there's a Table class that wraps the database and provides subscription and manipulation/iteration methods. The components in the page use instances of Table to get access to shared storage, and receive notification events when something else has updated it: for example, when the user adds a feed from the application menu, the listing component sees that the database has changed and re-renders to add that podcast to the list.

Be careful with property/method masking. This is far more relevant when working with other people than if you're writing software for yourself, but remember that properties or methods that you create in your class definitions will supplant any existing fields that exist on HTMLElement For example, on one project, I stored the default slot for a component on this.slot, not realizing that Element.slot already exists. Since no code on the page was checking that property, it didn't cause any problems. But if you're working with other people or libraries that expect to see the standard DOM value, you may not be so lucky.

Consider Symbols over private properties to avoid masking. One way to keep from accidentally overwriting a built-in field name is by using private properties, which are prefixed with a hash. However, these have some downsides: you can't see them in the inspector in Firefox, and you can't access them from subclasses or through Proxies (I've written a deeper dive on that here). If you want to store something on an element safely, it may be better to use a Symbol instead, and export that with your base class so that subclasses can access it.

export const CANVAS = Symbol("#canvas");
export const CONTEXT = Symbol("#context");

export class BitmapElement extends HTMLElement {
  constructor() {
    super();
    this.attachShadow({ mode: "open" });
    this[CANVAS] = document.createElement("canvas");
    this[CONTEXT] = this[CANVAS].getContext("2d");
  }
}

The syntax itself looks a little clunkier, but it offers encapsulation closer to the protected keyword in other languages (where subclasses can access the properties but external code can't), and I personally think it's a nice middle ground between actual private properties and "private by convention" naming practices like this._privateButNotReally.

Inherit broadly, not deeply. Here, once again, it's instructive to look at the browser itself: although there are some elements that have extremely lengthy prototype chains (such as the SVG elements, for historical reasons), most HTML classes inherit from a relatively shallow list. For most applications, you can probably get away with just one "framework" class that everything inherits from, sometimes with a second derived class for families of specific functionality (such as embedded DSLs).

There's a part of me that feels like jumping into a wave of interest in web components with a tribute to classical inheritance has real "how do you do, fellow kids?" energy. I get that this isn't the sexiest thing you can write about an API, and it's very JavaScript-heavy for people who are excited about the HTML component trend.

But it also seems clear to me, reading the last few years of commentary, that a lot of front-end folks just aren't familiar with this paradigm — possibly because frameworks (and React in particular) have worked so hard to isolate them from the browser itself. If you try to turn web components into React, you're going to have a bad time. Embrace the platform, learn its design patterns on their own terms, and while it still won't make object orientation cool, you'll find it's a much more pleasant (and stable) environment than it's been made out to be.

November 21, 2023

Filed under: tech»web»components

Chiaroscuro, or Expressive Trees in Web Components

Over the last few weeks, there's been a remarkable shift in the way that the front-end community talks about web components. Led by a number of old-school bloggers, this conversation has centered around so-called "HTML components," which primarily use custom elements as a markup hook to progressively enhance existing light DOM (e.g., creating tabs, tooltips, or sortable tables). Zach Leatherman's taxonomy includes links to most of the influential blog posts where the discussions are taking place.

(Side note: it's so nice to see blogging start to happen again! Although it's uncomfortable as we all try to figure out what our new position in the social media landscape is, I can't help but feel optimistic about these developments.)

Overall, this new infusion of interest is a definite improvement from the previous state of affairs, which was mostly framework developers insisting that anything less than a 1:1 recreation of React or Svelte in the web platform was a failure. But the whiplash from "this API is useless because it doesn't bundle enough complexity" to "this API can be used in the simplest possible way" leaves a huge middle ground unexplored, including its most intriguing possibilities.

So in the interest of keeping the blog train rolling, I've been thinking about writing some posts about how I build more complex web components, including single-page apps that are traditionally framework territory, while still aiming for technical accessibility. Let's start by talking about slots, composition, and structure.

Starting from slots

I wrote a little about shadow DOM in 2021, right before NPR published the Science of Joy, which used shadow DOM pretty extensively. Since that time, I've rewritten my podcast client and RSS reader, thrown together an offline media player, developed (for no apparent reason) a Eurorack-esque synthesizer, and written a social card image generator just in time for Twitter to fall apart. Between them, plus the web component book I wrote while wrapping up at NPR, I've had a chance to explore the shadow DOM in much more detail.

I largely stand by what I said in 2021: shadow DOM is a little confusing, not quite as bad as people make it out to be, and best used in moderation. Page content wants to be in the light DOM as much as possible, so that it's easier to style, inspect, and access for scripting. Shadow DOM is analagous to private properties or Symbol keys in JS: it's where you put stuff that only that element (and its user) needs to access but the wider page doesn't know about. But with the addition of slots, shadow DOM is also the way that we can define the relationships of an element to its contents in a way that follows the grain of HTML itself.

To see why, let's imagine a component with what seems like a pointless shadow DOM:

class EmptyElement extends HTMLElement {
  constructor() {
    super();
    var root = this.attachShadow({ mode: "open" });
    root.innerHTML = "<slot></slot>";
  }
}
This class defines an element with a shadow root, but no private content. Instead, it just has a slot that immediately reparents its children. Why write a no-op shadow root like this?

One (minor) benefit is that it lets you provide automatic fallback content for your element, which is hard to do in the light DOM (think about a list that shows a "no items" message when there's nothing in it). But the more relevant reason is because it gives us access to the slotchange event, as well as methods to get the assigned elements for each slot. slotchange is basically connectedCallback, but for direct children instead the custom element itself: you get notified whenever the elements in a slot are added or removed.

Simple slotting is a great pattern if you are building wrapper elements to enhance existing HTML (similar to the "HTML components" approach noted above). For example, in my offline media player app, the visualizer that creates a Joy Division-like graph from the audio is just a component that wraps an audio tag, like so:

<audio-visuals>
  <audio src="file.mp3"></audio>
</audio-visuals>

When it sees an audio element slotted into its shadow DOM, it hooks it into the analyzer node, and there you go: instant WinAmp visualizer panel. I could, of course, query for the audio child element in connectedCallback, but then my component is no longer reactive, and I've created a tight coupling between the custom element and its expected contents that may not age well (say, a clickable HTML component that expects a link tag, but gets a button for semantic reasons instead).

Configuration through composition

Child elements that influence or change the operation of their parent is a pattern that we see regularly in built-ins:

  • Media elements (audio, video, and picture) get live input configuration from <source>
  • Subtitles are also loaded by placing a <track> inside an audio or video tag
  • Selectbox options on mobile are native UI generated from child elements
  • SVG filters contain a list of operation elements, some of which have their own child config tags (think <fePointLight> for the lighting effects, or the <feFuncX> elements in a component transfer)

Tarot, Chalkbeat's social card generator, takes this approach a little further. I talk about this a little in the team blog post, but essentially each card design is defined as an HTML template file containing a series of custom elements, each of which represents a preset drawing instruction (text labels, colored rectangles, images, logos, that kind of thing). For example, a very simple template might be something like:

<vertical-spacer padding="20 0">

  <series-logo color="accent" x=".7" scale=".4"></series-logo>

  <vertical-stack dx="40" anchor="top" x=".4">

    <text-brush
      size="60"
      width=".5"
      padding="0 0 20"
      value="Insert quote text here."
      >Quotation</text-brush>

    <image-brush
      recolor="accent"
      src="./assets/Chalkline-teal-dark.png"
      align="left"
    ></image-brush>
    
  </vertical-stack>

  <logo-brush x=".70" color="text" align="top"></logo-brush>

</vertical-spacer>

<photo-brush width=".4"></photo-brush>

Each of the "brush" elements has its customization UI in its shadow DOM, plus a slot that lets its children show through. The app puts the template HTML into a form so the user can tweak it, and then it asks each of the top-level elements to render. Some of them, like the photo brush, are leaf nodes: they draw their image to the canvas and exit. But the wrapper elements, like the spacer and stack brushes, alter the drawing context and then ask each of their slotted elements to render with the updated configuration for the desired layout.

The result is a nice little domain-specific language for drawing to a canvas in a particular way. It's easy to write new layouts, or tweak the ones we already have. My editor already knows how to highlight the template, because it's just HTML. I can adjust coordinate values or brush nesting in the dev tools, and the app will automatically re-render. You could do this without slots and shadow DOM, but it would be a lot messier. Instead, the separation is clean: user-facing UI (i.e., private configuration state) is in shadow, drawing instructions are in the light.

Patchwork languages

I really started to see the wider potential of custom element DSLs when I was working on my synthesizer, which represents the WebAudio signal path using the DOM. Child elements feed their audio signal into their parents, on up the tree until they reach an output node. So the following code creates a muted sine wave, piping the oscillator tone through a low-pass filter:

<audio-out>
  <fx-filter type="lowpass">
    <source-osc frequency=440></source-osc>
  </fx-filter>
</audio-out>

The whole point of a rack synthesizer is that you can rearrange it by running patch cords between various inputs and outputs. By using slots, these components effectively work the same way: if you drag the oscillator out of the filter in the inspector, the old and new parents are notified via slotchange and they update the audio graph accordingly so that the sine wave no longer runs through the lowpass. The dev tools are basically the patchbay for the synth, which was a cool way to give it a UI without actually writing any visual code.

Okay, you say, but in a Eurorack synthesizer, signals aren't just used for audible sound: the same outputs can be used as control voltage, say to trigger an envelope or sweep a frequency. WebAudio basically replicates this with parameter inputs that accept the same connections as regular audio nodes. All I needed to do to expose this to the document was provide named slots in components:

<fx-filter frequency=200>
  <fx-gain gain=50 slot=frequency>
    <source-osc frequency=1></source-osc>
  </fx-gain>
  <source-osc frequency=440></source-osc>
</fx-filter>

Here we have a similar setup as before, where a 440Hz tone is fed into a filter, but there's an additional input: the <fx-gain> is feeding a control signal with a range of -50 to 50 into the filter's frequency parameter once per second. The building blocks are the same no matter where we're routing a signal, and the code for handling parameter inputs ends up being surprisingly concise since it's able to lean on the primitives that slots provide for us.

The Mask of the Demon

In photography and cinema, the term "chiaroscuro" refers to the interplay and contrast between light and dark — Mario Bava's Black Sunday is one of my favorite examples, with its inky black hallways and innovative color masking effects. I think of the shadow DOM the same way: it's not a replacement for the light DOM, but a complement that can be used to give it structure.

As someone who loves to inject metaphor into code, this kind of thing is really satisfying. By combining slots, shadow DOM, and markup patterns, we can embed a language in HTML that produces either abstract data structures, user interface, or both. Without adding any browser plugins, we're able to manipulate this tree just using the dev tools, so we can easily experiment with our application, and it's compatible with our existing editor tooling too.

Part of the advantage of custom elements is that they have a lower usage floor: they do really well at replacing the kinds of widgets that jQueryUI and Bootstrap used to provide, which don't by themselves justify a full single-page app architecture. This makes them more accessible to the kinds of people that React has spent years alienating with JS-first solutions — and by that, I mean designers, or people who primarily use the kinds of HTML/CSS skills that have been gendered as feminine and categorized as "lesser" parts of the web stack.

So I understand why, for that audience, the current focus is on custom elements that primarily use the light DOM: after all, I started using custom elements in 2014, and it took six more years before I was comfortable with adding shadow DOM. But it's worth digging a little deeper. Shadow DOM and slots are some of my favorite parts of the web component API now, because of the way that they open up HTML as not just a presentational toolkit, but also as an abstraction for expressing myself and structuring my code in a language that's accessible to a much broader range of people.

March 21, 2023

Filed under: tech»mobile

Enthusiasm Gap

If I had to guess, I'd say the last time there was genuine grassroots mania for "apps" as a general concept was probably around 2014, a last-gasp burst of energy that coincides with a boom of the "sharing economy" before it became clear the whole thing was just sparkling exploitation. For a certain kind of person, specifically people who are really deeply invested in having a personal favorite software company, this was a frustrating state of affairs. If you can't count the apps on each side, how can you win?

Then Twitter started its slow motion implosion, and Mastodon became the beneficiary of the exodus of users, and suddenly last month there was a chance for people to, I don't know, get real snotty about tab animations or something again. This ate up like a week of tech punditry, and lived rent-free in my head for a couple of days.

It took me a little while to figure out why I found this entire cycle so frustrating, other than just general weariness with the key players and the explicit "people who use Android are just inherently tasteless" attitude, until I read this post by game dev Liz Ryerson about GDC, and specifically the conference's Experimental Games Workshop session. Ryerson notes the ways that commercialization in indie games led to a proliferation of "one clever mechanic" platformers at EGW, and an emphasis on polish and respectability — what she calls "portfolio-core" — in service of a commercial ideology that pushed quirkier, more personal titles out:

there is a danger here where a handful of successful indie developers who can leap over this invisible standard of respectability are able to make the jump into the broader industry and a lot of others are expected not to commercialize their work that looks less 'expensive' or else face hostility and disinterest. this would in a way replicate the situation that the commercial indie boom came out of in the 2000's.

however there is also an (i'd argue) even bigger danger here: in a landscape where so many niche indie developers are making moves to sell their work, the kind of audience of children and teenagers that flocked to the flash games and free web games that drove the earlier indie boom will not be able to engage with this culture at large anymore because of its price tag. as such, they'll be instead sucked into the ecosystem of free-to-play games and 'UGC' platforms like Roblox owned by very large corporate entities. this could effectively destroy the influence and social power that games like Yume Nikki have acquired that have driven organic fan communities and hobbyist development, and replace them with a handful of different online ecosystems that are basically 'company towns' for the corporations who own them. and that's not a good recipe if you want to create a space that broadly advocates for the preservation and celebration of art as a whole.

It's worth noting that the blog post that kicked off the design conversation refers to a specific category of "enthusiast" apps. This doesn't seem to be an actual term in common use anywhere — searching for this provides no prior art, except in the vein of "apps for car enthusiasts" — and I suspect that it's largely used as a way of excluding the vast majority of software people actually use on mobile: cross-platform applications written by large corporations, which are largely identical across operating systems. And of course, there's plenty of shovelware in any storefront. So if you want to cast broad aspersions across a userbase, you have to artificially restrict what you're talking about in a vaguely authoritative way to make sure you can cherry-pick your examples effectively.

In many ways, this distinction parallels the distinction Ryerson is drawing, between the California ideology game devs that focus on polish and "finish your game" advice, and (to be frank) the weirdos, like Stephen "thecatamites" Gillmurphy or Michael Brough, designers infamous for creating great games that are "too ugly" to sell units. It's the idea that a piece of software is valuable primarily because it is a artifact that reminds you, when you use it, that you spent money to do so.

Of course, it's not clear that the current pace of high-definition, expansive scope in game development is sustainable, either: it requires grinding up huge amounts of human capital (including contract labor in developing countries) and wild degrees of investment, with no guarantee that the result will satisfy the investor class that funded it. And now you want to require every little trivial smartphone app have that level of detail? In this economy?

To be fair, I'm not the target audience for that argument. I write a lot of my own software. I like a lot of it, and some of it even sparks joy, but not I suspect in the way that the "enthusiast app" critics are trying to evoke. Sometimes it's an inside joke for an audience of one. Maybe I remember having a good time getting something to work, and it's satisfying to use it as a result. In some cases (and really, social media networks should be a prime example of this), the software is not the point so much as what it lets me read or listen to or post. Being a "good product" is not the sum total through which I view this experience.

(I would actually argue that I would rather have slightly worse products if it meant, for example, that I didn't live in a surveillance culture filled with smooth, frictionless, disposable objects headed to a landfill and/or the bottom of the rapidly rising oceans.)

Part of the reason that the California ideology is so corrosive is because it can dangle a reward in front of anything. Even now, when I work on silly projects for myself, I find myself writing elaborate README files or thinking about how to publish to a package manager — polish that software, and maybe it'll be a big hit in the marketplace, even though that's actually the last thing I would honestly want. I am trying to unlearn these urges, to think of the things I write as art or expression, and not as future payday. It's hard.

But right now we are watching software companies tear themselves apart in a series of weird hype spasms, from NFTs to chatbots to incredibly ugly VR environments. It's an incredible time to be alive. I can't imagine anything more depressing than to look at Twitter's period of upheaval, an ugly transition from the worldwide embodiment of context collapse to smaller, (potentially) healthier communities, and to immediately ask "but how can I turn this into a divisive, snide comment?" Maybe I'm just not enough of an enthusiast to understand.

February 20, 2023

Filed under: tech»web

Build Less

When it comes to web development, I'm actually fairly traditional. By virtue of the kinds of apps I make (either bespoke visualizations for work or single-serving toys for personal use), I'm largely isolated from a lot of the pain of modern front-end web development. I don't use React, I don't need to scale servers, and I render my HTML the old-fashioned way, from string templates. Even so, my projects are usually built on top of a few build tools, including Rollup, Less, and various SDKs for moving data between different cloud providers.

However, for internal utilities and personal projects over the last few years, I've been experimenting with removing tools, and relying solely on the modern browser. So instead of bundling JS, I'm just loading modules with import statements. I write one CSS file for my light DOM, but custom properties have largely eliminated what I need a preprocessor to do (and the upcoming support for nesting will cover the rest). Add something like the Eleventy dev server for live reload, and it's actually a really pleasant experience.

It's one thing to go minimalist for a single-serving hobby app, or for people in the Chalkbeat newsroom who can reach me directly for support. It's another to do it for a general audience, where the developer/user ratio starts to tilt and your scale becomes more amibitious. But could we develop a real, public-facing web app that doesn't rely on a brittle and slow compilation step? Is a no-build deployment feasible?

While I'm optimistic, I have enough self-awareness to know that things are rarely as simple as I want them to be. I wasn't always a precious snowflake, and I've seen first-hand that national (or international) scale applications have support infrastructure for a reason. To that end, here's a non-exhaustive list of potential hurdles I believe developers will need to jump to get to that tooling-free future.

Caching and coherency

In theory, HTTP2 (which reuses connections and parallelizes transfers) means that we don't pay a penalty for deploying our JavaScript as individual modules instead of a single bundled file. But it raises a new issue that we didn't have with those big bundles: what happens when we make a breaking change in part of the application, and someone visits it with a partially-primed cache, so they have some old files still hanging around? How do we make sure that we can take advantage of caching appropriately, while still keeping our code coherent for a given deployment?

Imagine we have a page that loads module A, which loads B and C, and is styled using CSS file D. I update file B, and changes to D are required for the new components. Different files may be evicted from the browser cache in unpredictable ways, though. Ideally, A and C should be loaded from the cache, and B and D should be fresh requests. If everything comes from the cache, users won't see new features, but ideally nothing should be immediately broken. It would be wasteful, but not disastrous, if all files are loaded fresh. The real problem comes if only one of B or D comes from the cache, so that we either get new code without the matching style changes, or styles without the new code.

As Jake Archibald notes, there are two working (and compatible) strategies for caching interrelated code: either long cache times with unique URLs, or no-cache headers and a shorter lifetime. I lean toward the latter strategy for now, probably using ETag hash-based headers for each file. Individual requests would be a little slower, since the browser would always check the server for individual files, but you'd only actually transfer new code, which is the expensive part (cache hits would return 304 Not Modified). Based on my experience with a similar system for election data updates, I think this would probably scale pretty well, but you'd need to test to be sure.

Once import maps are supported in all evergreen browsers, the hashed URL solution becomes the simpler of the two. Use short identifiers for all your import statements (say, based from the project root), and then hash their contents and generate a JSON mapping between the original path and the mangled filename for production deployments. Now the initial page load can be revalidated on every load, but the scripts that go with that particular page version will be immutable, guaranteeing that any change means a new URL and no cache conflicts. Here's hoping Safari ships import maps to users soon.

Vendor code

Personally, the whole point of developing things in a no-build environment is that I don't need to learn, manage, and optimize around third-party libraries. The web platform is far from perfect, but it's fast and accessible, and there's an undeniable pleasure in writing every line of code. I'm lucky that I have that opportunity.

Most teams are not lucky, and need to load libraries written by other people. Package managers mean we have a wealth of code at our fingertips. But at the same time, the import patterns that work well for Node (lots of modules in a big, deep folder hierarchy) have proven a clumsy match for the front-end. Importing files from node_modules is clumsy and painful, especially if you're also loading stylesheets and other non-JavaScript assets. In fact, much of the tooling explosion (including innovations like tree-shaking and transpilation) comes from trying to have our cake from npm and eat it too.

So loading from the same package manager as the server-side code is frustrating, and using a CDN requires us to trust a remote host completely (plus introducing another DNS/TCP handshake) into our performance waterfall. The ideal would be a shallow set of third-party modules that are colocated with our front-end code, similar to how Bower (RIP) used to handle libraries. Sadly, there are few tools or code conventions that I'm aware of now specifically for that niche anymore.

One approach that I'm intrigued by is Deno's bundle command, which generates an importable module file from an URL, including all its dependencies. Using a tool like this, you could pretty easily zip up vendor code into a single file in the equivalent of src/bower_components. You'd also have a lot more visibility into just how big those third-party libraries are when they're packed up into self-contained (absolute) units, which might provoke a little reflection. Maybe you don't need 3MB of time zone data after all.

That said, one secret weapon for managing those chunky libraries is asynchronous import(). Whereas code-splitting in a bundler is a complicated and niche process, when we use ES modules natively our code is effectively pre-split, and the browser gives us a mechanism to only request libraries when we need them. This means the cost equation for vendor code can change somewhat: maybe it's not great that a given component is multiple megabytes of script, but if users only pay the cost for that transfer and compilation when they're actually going to use it, that's a substantial improvement over the current state of affairs.

CSS imports

I've worked on some large projects where we had a single, unprocessed CSS file for the product. It was hard to stay disciplined. Without nesting or external constraints, we'd end up duplicating styles in different parts of the document and worrying about breakages if we needed to change something. The team tried to keep things well-structured, but you know how it is: if you've got six programmers, you have 12 different ideas about how the site should be organized.

CSS has @import for natively splitting styles into multiple files, but historically it hasn't been considered good for performance. Imports block the renderer and parser, meaning that you may be halting page load for the header while you wait for footer styles. We still want multiple small files on HTTP2, so the best practice is still to generate lots of <link> tags for CSS, possibly using tricks to unblock the parser. Luckily, at least, CSS is not load-order dependent the way that JavaScript is, and the @layer rule gives us ways to manage the cascade. But manually appending a tag for every stylesheet doesn't feel very ergonomic.

I don't have a good solution here. It's possible this is not as serious a problem as I think it is — certainly on my own projects, I'm able to move localized styles into shadow DOM and load them as a part of the component registration, so it tends to solve itself for anything that's heavily interactive or component-based. But I wish @import had the kind of ergonomics and care that its JavaScript counterpart did, and I suspect teams will find PostCSS easier to use than the no-build alternative.

HTML partials and templating

What's the ultimate point of eschewing build tools? Sure, on some level it's to avoid ever touching webpack.config.js (a.k.a. the Lament Configuration) ever again. But it's also about trying to claw our way back from a front-end culture that has neglected the majority of users. And the best way to address an audience on typical devices (read: an Android phone with meager single-core performance and a spotty network connection, or a desktop PC from 2016) is to send less JavaScript and more HTML and CSS.

Last week, I loaded a page from a local news outlet for work, which included data on a subset of Chicago schools. There was no dynamic content, although it did have an autocomplete search at the top. I noticed the browser tab was stuttering on load, so I looked in the dev tools: each of the 600+ schools was being individually templated and appended to a queried element from a JSON fetch. On a fairly new desktop PC, it froze the UI thread for more than half a second. On a phone, that was more like 7 seconds, even with ads blocked, and any news dev will tell you that the absolute easiest way to boost your story's load performance is to remove the ads from it.

If that page had been built as static HTML, it would parse and load almost instantly by comparison. Indeed, in the newsroom projects that I maintain, the most important feature is the ability to pull in data from a variety of sources (Google Docs, Sheets, local text and CSV, JSON, remote APIs) and merge that easily with the HTML template. The build scripts do other things, like bundling and CSS processing and deployment. But those things could be replaced, or reduced, or moved into other tools without radically changing the experience. HTML generation is irreplaceable.

At a bare minimum, let's say I want to be able to include partial templates (for sharing headers and snippets between pages), loop through some data, and inject my import map or my stylesheet collection into the page. Here's a list of the tools that let me do that easily, on most Linux servers, without installing a bunch of extra crap:

  • PHP

Listen, no shade on PHP, but I don't want to write it for a living anymore even if I wasn't working off of static file storage. It's a hard sell, especially in the context of "a modern web stack."

HTML templating is where the rubber really meets the road. We do not have capabilities for meta-processing in the language itself, and any solution that involves JavaScript (including the late, lamented HTML imports) is a non-starter. I'd kill for an <include> tag, especially if there were a way to use it without blocking the parser, similar to the way that declarative shadow DOM provides declarative support for component subtrees.

What I'm not interested in doing is stripping the build toolchain down if it means a worse experience for users. And once I need some kind of infrastructure to assemble my HTML, it's not actually that much more work to bolt on a script bundler and a stylesheet preprocessor, and reap the benefits from those ecosystems. I'm all-in on the web platform, but I'm not a masochist.

Let's build

This is by no means an exhaustive list of challenges, but Nano tells me I'm well past 200 lines in this text document, so let's wrap it up.

The good news, as I see it, is that the browser is in a healthier place than ever for hobbyists, students, and small project developers. You can open index.html, import Lit or Vue from a CDN, and have a reasonably performant front-end environment that can be grow more complex to fit your needs and skills. You can also write a lot less JavaScript than in years past, because CSS has gotten so much better for layout and interaction.

I'd say we're within reach of a significantly less complicated front-end technical culture. I would not be surprised to see companies start to experiment with serving JavaScript or CSS directly, using tooling to smooth off the rough edges (e.g., producing import maps or automating stylesheet inclusion) rather than leaning hard into full, slow-moving compilation steps. The ergonomics of these approaches are going to be better than a lot of people expect. Some front-end teams that have specialized in tooling-intensive ecosystems are going to either eat a lot of crow or get very angry for a while.

All that said: we're not going back to the days when all you needed was notepad.exe and some moxy to make a "real" website. Perhaps it's naive to think we ever were. But making a good web app is hard, I would argue harder than many other kinds of programming. It's the code you write in a trio of languages, but also the network between you and the user, the management of distributed state, and a vast range of devices, inputs, and outputs. The least we can do is make it less wearying to get started.

June 19, 2022

Filed under: tech»coding

The Many-Threaded Hydra

The Emperor had set out to beat not just Gurgeh, but the whole Culture. There was no other way to describe his use of pieces, territory and cards; he had set up his whole side of the match as an Empire, the very image of Azad.

Another revelation struck Gurgeh with a force almost as great; one reading — perhaps the best — of the way he'd always played was that he played as the Culture. He'd habitually set up something like the society itself when he constructed his positions and deployed his pieces; a net, a grid of forces and relationships, without any obvious hierarchy or entrenched leadership, and initially quite peaceful.

[...] Every other player he'd competed against had unwittingly tried to adjust to this novel style in its own terms, and comprehensively failed. Nicosar was trying no such thing. He'd gone the other way, and made the board his Empire, complete and exact in every structural detail to the limits of definition the game's scale imposed.

Iain M. Banks' classic novel Player of Games follows Jernau Morat Gurgeh, who is sent from the Culture (a socialist utopia that's the standard setting for most of Banks' genre fiction) to compete in a rival society's civil service exam, which takes the form of a complicated wargame named Azad. The game is thought by its adherents to be so complex, so subtle, that it serves as an effective mirror for the empire itself.

Azad is, obviously, not real — it's a thought experiment, a clever dramatic conceit along the lines of Borges' famous 1:1 scale map. But we have our own Azad, in a way: as programmers, it's our job to create systems of rules and interactions that model a problem. Often this means we intentionally mimic real-world details in our code. And sometimes it may mean that we also echo more subtle values and viewpoints.

I started thinking about this a while back, after reading about how some people think about the influences on their coding style. I do think I have a tendency to lean into "playful" or expressive JavaScript features, but that's just a symptom of a low boredom threshold. Instead, looking back on it, what struck me most about my old repos was a habitual use of what we could charitably call "collaborative" architecture.

Take Caret, for example: while there are components that own large chunks of functionality, there's no central "manager" for the application or hierarchy of control. Instead, it's built around a pub/sub command bus, where modules coordinating through broadcasts of custom events. It's not doctrinaire about it — there's still lots of places where modules call into each other directly (probably too many, actually) — but for the most part Caret is less like a modern component tree, and more like a running conversation between equal actors.

I've been using variations on this design for a long time: the first time I remember employing it is the (now defunct) economic indicator dashboard I built for CQ, which needed to coordinate filters and views between multiple panels. But you can also see it in the NPR primary election rig, Weir's new UI, and Chalkbeat's social media card generator, among others. None of these have what what we would typically think of as a typical framework "inversion of control." I've certainly built more traditional, framework-first applications, but it's pretty obvious where my mind goes if given free rein.

(I suspect this is why I've taken so strongly to web components as a toolkit: because they provide hooks for managing their own lifecycle, as well as direct connection to the existing event system of the DOM, they already work in ways that are strongly compatible with how I naturally structure code. There's no cost of convenience for me there.)

There are good technical reasons for preferring a pub/sub architecture: it maps nicely onto the underlying browser platform, it can grow organically without having to plan out a UML diagram, and it's conceptually easy to understand (even if you don't just subclass EventTarget, you can implement the core command bus in five minutes for a new project). But I also wondered if there are non-technical reasons that I've been drawn to it — if it's part of my personal Azad/Culture strategy.

I'm also asking this in a very different environment than even ten years ago, when we used to see coyly neo-feudalist projects like Urbit gloss over their political design with a thick coat of irony. These days, the misnamed "web3" movement is explicit about its embrace of the Californian ideology: not just architecture that exists inside of capitalism, but architecture as capitalism, with predictable results. In 2022, it's not quite so kooky to say that code is cultural.

I first read Rediker and Linebaugh's The Many-Headed Hydra: Sailors, Slaves, Commoners, and the Hidden History of the Revolutionary Atlantic in college, which introduced me to the concept of hydrarchy: a type of anarchism formed by the "motley crew" of pirate ships in contrast to the strict class structures of merchant companies. Although they still had captains who issued orders, that leadership as not absolute or unaccountable, and it was common practice for pirates to put captured ship captains at the mercy of their crews as a taste of hydrarchy. A share system also meant that spoils were distributed more equally than was the case on merchant ships.

The hydrarchy was a huge influence on me politically, and it still shapes the way I manage teams and projects. But is it possible that it also influenced the ways I tend to think about and write code systems? This is a silly question, but not I think a stupid one: a little introspection can be valuable, especially if it provides insight in how to explain our work to beginners or accommodate their own subconscious worldviews.

This is not to say that, for example, Caret is an endorsement of piracy, or even a direct analog (certainly not in the way that web3 is tied to venture capitalism). But it was built the way it was because of who did the building. And its design did have cultural implications: building on top of events means that you could write a Caret plugin just by sending messages to its Chrome process, including commands for the Ace editor. The promise (not always kept, to be fair) was that your external code was using the same APIs that I used internally — that you were a collaborator with the editor itself. You had, as it were, an equal share in the outcome.

As we think about what the "next era of JavaScript" looks like, there's a tendency to express it in terms of platforms and layers. This isn't wrong! But if we're out here dreaming up new workflows empowered by edge computing, I think we can also spare a little whimsy for models beyond "pure render functions" or "strict hierarchy of control," and a little soul-searching about what those models for the next era might mean about our own mindsets.

March 31, 2022

Filed under: tech»open_source

CTRL alt QMK

Like a lot of people during the pandemic, early last year I got into mechanical keyboard collecting. Once you start, it's an easy hobby to sink a lot of time and money into, but the saving grace is that it's also ridiculously inconvenient even before the supply chain imploded, since everything is a "group buy" or some other micro-production release, so it tends to be fairly self-limiting.

I started off with a Drop CTRL, which is a pretty basic mechanical that serves as a good starting point. Then I picked up a Keychron Q1, a really sharp budget board that convinced me I need more keys than a 75% layout, and finally a NovelKeys NK87 with Box Jade clicky switches, which is just just a lovely piece of hardware and what I'm using to type this.

All three of these keyboards are (very intentionally) compatible with the open-source QMK firmware. QMK is very cool, and ideally it means that any of these keyboards can be extended, customized, and updated in any way I want. For example, I have a toggle set up on each board that turns the middle of the layout into a number pad, for easier spreadsheet edits and 2FA inputs. That's the easy mode — if you really want to dig in and write some C, these keyboards run on ARM chips somewhere on the order of a Nintendo DS, so the sky's pretty much the limit.

That said, "compatible" is a broad term. Both the Q1 and NK87 have full QMK implementations, including support for VIA for live key-remapping and macros, but the CTRL (while technically built on QMK) is usually configured via a web service. It's mostly reliable, but there have been a few times in the last few months where the firmware I got back after remapping keys was buggy or unreliable, and this week I decided I wanted to skip the middleman and get QMK building for the CTRL, including custom lighting.

Well, it could have been easier, that's for sure. In getting the firmware working the way I wanted it, I ended up having to trawl through a bunch of source code and blog posts that always seemed to be missing something I needed. So I decided I'd write up the process I took, before I forget how it went, in case I needed it in the future or if someone else would find it helpful.

Building firmware

The QMK setup process is reasonably well documented--it's a Python package, mostly, wrapped around a compilation toolchain. It'll clone the repo for you and install a qmk command that manages the process. I set mine up on WSL and was up and running pretty quickly.

Once you have the basics going, you need to create a "keymap" variation for your board. In my case, I created a new folder at qmk_firmware/keyboards/massdrop/ctrl/keymaps/thomaswilburn. There are already a bunch of keymaps in there, which is one of the things that gives QMK a kind of ramshackle feel, since they're just additions by randos who had a layout that they like and now everyone gets a copy. Poking around these can be helpful, but they're often either baroque or hyperspecialized (one of them enables the ability to programmatically trigger individual lights from terminal scripts, for example).

However, the neat thing about QMK's setup is that the files in each keymap directory are loaded as "overrides" for the main code. That means you only need to add the files that change for your particular use, and in most cases that means you only need keymap.c and maybe rules.mk. In my case, I copied the default_md folder as the starting place for my setup, which only contains those files. Once that's done, you should be able to test that it builds by running qmk compile -kb massdrop/ctrl -km thomaswilburn (or whatever your folder was named).

Once you have a firmware file, you can send it to the keyboard by using the reset button on the bottom of the board and running Drop's mdloader utility.

Remapping

QMK is designed around the concept of layers, which are arrays of layout config stacked on top of each other. If you're on layer #3 and you press X, the firmware checks its config to see if there's a defined code it should send for that physical key on that layer. QMK can also have a slot defined as "transparent," which means that if there's not a code assigned on the current layer, it will check the next one down, until it runs out. So, for example, my "number pad" layer defines U as 4, I as 5, and so on, but most of the keys are transparent, so pressing Home or End will fall through and do the right thing, which saves time having to duplicate all the basic keys across layers.

If your board supports VIA, remapping the layer assignments is easy to do in software, and your keymap file will just contain mostly empty layers. But since the CTRL doesn't support VIA, you have to assign them manually in C code. Luckily, the default keymap has the basics all set up, as well as a template for an all-transparent layer that you can just copy and paste to add new ones. You can see my layer assignments here. The _______ spaces are transparent, and XXXXXXX means "do nothing."

There's a full list of keycodes in the QMK docs, including a list of their OS compatibility (MacOS, for example, has a weird relationship with things like "number lock"). Particularly interesting to me are some of the combos, such as LT(3, KC_CAPS), which means "switch to layer three if held, but toggle caps lock if tapped." I'm not big on baroque chord combinations, but you can make the extended functions a lot more convenient by taking advantage of these special layer behaviors.

Ultimately, my layers are pretty straightforward: layer 0 is the standard keyboard functions. Layer 1 is fully transparent, and is just used to easily toggle the lighting effects off and on. Layer 2 is number pad mode, and Layer 3 triggers special keyboard behaviors, like changing the animation pattern or putting it into "firmware flash" mode.

Lighting

Getting the firmware compiling was pretty easy, but for some reason I could not get the LED lighting configuration to work. It turns out that there was a pretty silly answer for this. We'll come back to it. First, we should talk about how lights are set up on the CTRL.

There are 119 LEDs on the CTRL board: 87 for the keys, and then 32 in a ring around the edges to provide underglow. These are addressed in the QMK keymap file using a legacy system that newer keyboards eschew, I think because it was easier for Drop to build their web config tool around the older syntax. I like the new setup, which lets you explicitly specify ranges in a human-readable way, but the Drop method isn't that much more difficult.

Essentially, the keymap file should set up an array called led_instructions filled with C structs configuring the LED system, which you can see in my file here. If you don't write a lot of C, the notation for the array may be unfamiliar, but these unordered structs aren't too difficult from, say, JavaScript objects, except that the property names have to start with a dot. Each one gets evaluated in turn for each LED, and a set of flags tells QMK what conditions it requires to activate and what it does. These flags are:

  • LED_FLAG_USE_PATTERN - indicates that you're going to set a specific pattern by index from the set of different animations that the CTRL ships by default. For example, .pattern = 3 should activate the teal/salmon gradient.
  • LED_FLAG_USE_ROTATE_PATTERN - indicates that you want to use the user-selectable pattern, which the user can switch between using hotkeys.
  • LED_FLAG_USE_RGB - indicates that instead of using a preset color or pattern, you'll provide custom RGB values for the LEDs.
  • LED_FLAG_MATCH_LAYER - will only apply this lighting when the current layer matches the provided index.
  • LED_FLAG_MATCH_ID - will only apply this lighting to LEDs matching an ID bitmask.
Combining these gives you a lot of flexibility. For example, let's say I want to light up the keys in the "number pad" (7-9, U-O, J-L, and M-period) in bright green when layer #2 is active. For that case, the struct looks something like this:
{
  .flags = LED_FLAG_MATCH_LAYER | 
    LED_FLAG_USE_RGB | 
    LED_FLAG_MATCH_ID,
  .g = 255, 
  .id0 = 0x03800000,
  .id1 = 0x0E000700,
  .id2 = 0xFF8001C0,
  .id3 = 0x00FFFFFF,
  .layer = 2
},
The flags mean that this will only apply when the active layer matches the .layer property, we're going to provide color byte values (just .g in this case, since the red and blue values are both zero), and only LEDs matching the bitmask in .id0 through .id3 will be affected.

Most of this is human-readable, but those IDs are a pain. They are effectively a bitmask of four 32-bit integers, where each bit corresponds to an LED on the board, starting from the escape key (id 0) and moving left-to-right through each row until you get to the right arrow in the bottom-right of the keyboard (id 86), and then proceeding clockwise all around the edge of the keyboard. So for example, to turn the leftmost keys on the keyboard, you'd take their IDs (0 for escape, 16 for `, 35 for tab, 50 for capslock, 63 for left shift, and 76 for left control), divide by 32 to find out which .idX value you want, and then modulo 32 to set the correct bit within that integer (in this case, the result is 0x00010001 0x80040002 0x00001000). That's not fun!

Other people who have done this have used a Python script that requires you to manually input the LED numbers, but I'm a web developer. So I wrote a quick GUI for creating the IDs for a given lighting pattern: click to toggle a key, and when the diagram is focused you can also press physical keys on your keyboard to quickly flip them off and on. The input contains the four ID integers that the CTRL expects when using the LED_FLAG_MATCH_ID option.

Using this utility script, it was easy to set up a few LED zones in a Vilebloom theme that, for me, evokes the classic PDP/11 console. But as I mentioned before, when I first started configuring the LED system, I couldn't get anything to show up. Everything compiled and loaded, and layers worked, but no lights appeared.

What I eventually realized, to my chagrin, was that the brightness was turned all the way down. Self-compiled QMK tries to load settings from persistent memory, including the active LED pattern and brightness, but I suspect the Drop firmware doesn't save them, so those addresses were zero. After I used the function keys to increase the backlight intensity, everything worked great.

In review

As a starter kit, the CTRL is pretty good. It's light but solidly constructed with an aluminum case, relatively inexpensive, and it has a second USB-C port if you want to daisy-chain something else off it. It's a good option if you want to play around with some different switch options (I added Halo Clears, which are pingy but have the same satisfying snap as that one Nokia phone from The Matrix).

It's also weirdly power-hungry, the integrated plate means it's stiff and hard to dampen acoustically, it only takes 3-prong switches, and Drop's software engineering seems to be stretched a little thin. So it's definitely a keyboard that you can grow beyond. But I'm glad I put the time into getting the actual open source firmware working — at the very least, it can be a fun board for experimenting with layouts and effects. And if you're hoping to stretch it a little further than its budget roots, I hope the above information is useful.

September 21, 2021

Filed under: tech»coding

I am FM

My last day at NPR was September 3, and I started at Chalkbeat on September 13. In the nine days in between, I tried to detox: I stayed away from the news, played a lot of Castlevania, and — in an effort to not feel completely useless — worked on a project I've been meaning to tackle for a while: I wrote a browser-based FM synth modeled on the classic Yamaha DX-7.

The DX-7 is the classic Lament Configuration of digital sound design. It's not only based on a model of synthesis that's unintuitive, but Yamaha wrapped it in a pushbutton user interface that discourages experimentation. Its sound defined an era almost entirely through the presets: the piano arpeggios from Twin Peaks, countless Whitney Houston ballads, and the bass line from Take on Me. I never owned a genuine DX-7, but I had one of Yamaha's budget models, and learning to build sounds on it was a long-standing white whale of mine.

Modulation Operations

Most synthesizers are what we call additive and subtractive. You generate a waveform, either by combining different wave shapes (sine, rectangle, sawtooth, or triangle) or using a noise generator, and then patch that through a series of filters and effects, and out the other end either emerges a transcendant reinterpretation of Bach (if you're Wendy Carlos) or a kind of deranged squawking (if you're me). This kind of synthesis isn't easy, per se, but it makes sense to someone who has used, say, a guitar pedalboard.

The DX-7 works differently, using something called frequency modulation (FM) synthesis. Essentially, it uses up to six sine wave oscillator units (called "operators"), but most of them aren't audible at any given time. Instead, the secondary units (called "modulators") are used to tweak the frequency of the audible operators ("carriers"). When the wave output of the modulator goes up, so does the frequency of its carrier. When the wave goes down, the freqency dips. Since these changes in frequency happen many times a second, and are often scaled to the input pitch from the keyboard, the result are complicated harmonic patterns, often described as metallic, bell-like, percussive.

In the original DX-7 hardware, this is all done using a polynomial math equation, effectively shifting the sample location for the carrier wave based on the modulator value (there's a useful .gif at Wikipedia illustrating the principle). You can do this with relatively cheap processors, and indeed that's how most of the JavaScript implementations still do it: they generate a stream of audio data directly from the phase math, and pipe that to an output. But for the sake of prototyping, I decided to do it a different way, using the native WebAudio processing graph.

Adapting theory to practice

WebAudio is a kind of beautiful monstrosity. It's easy to imagine an API designer deciding that browser audio should be mostly WebGL-style primitives, basically just handing you an audio buffer array and leaving the sound generation up to you. Or the pendulum could have swung the other way, toward extreme user-friendliness, with just a slightly more performant version of the <audio> tag letting you load and trigger preset clips.

Instead, the final API ends up looking more akin to a classic Moog patchbay or a studio effects rack, letting you wire various modules together into a complex signal chain. Those nodes start out as simple oscillators and gain amplifiers, but from there it gets pretty batteries-included: nodes for impulse convolution, multiple shaped filters, and compression, plus a custom "script worker" node as an escape hatch.

Crucially for my purposes, WebAudio signal nodes can be wired to more than just audio inputs and outputs. You can also hook nodes into the control parameters, so that the output from one changes the volume or strength of another. In our case, we can use the audio signal from our modulators and pipe it into the frequency value of our carriers. It's not quite the same as the classic DX-7 formula, but it performs very well, and the sound actually isn't that far off. You can hear the classic EPIANO1 preset adapted to my code on the GitHub demo page for the project.

However, while this implementation felt more intuitive than juggling Math.sin(), WebAudio also has some quirks that made it tricky. For example, oscillators are single-shot: they can only be started and stopped once. The API is full of this kind of design, where you're supposed to create nodes, connect them to the graph, and then throw them away. But when you have modulator oscillators feeding into carrier oscillators in a complicated web of amplifiers and filters, disposable audio sources don't really fit the design.

In the end, I had to wrap the whole thing in a disposable Voice class that encapsulates an arrangement of operators for a single note. When the synth is asked to play a sound, it creates a Voice containing a fresh set of operators, hooks that up to the audio context, and sends it on its merry way. This effectively makes our synthesizer polyphonic by default, since each individual frequency gets its own voice on demand. It feels wasteful, but it works.

Gradual complexity

Working on a project like this makes me think a lot about how it is that I build projects, and how to teach others to do the same. We often tell junior developers that they should learn by creating something fairly complex, but we don't really tell them how to do it. I suspect this is because it becomes fairly instinctual over time, so it's hard to explain.

Part of what we don't tell junior developers is that big projects are built out of little projects, one level of abstraction at a time. For example, for the Hello Operator repo, the process of getting a (mostly) working synthesizer looks like:

  1. Hook up some basic oscillators and trigger them on a timer
  2. Wrap those oscillators in an Operator class and connect them together
  3. Instead of using a timer, set the keyboard to trigger playback
  4. Wrap the Operator objects in a Voice, so that they can be played repeatedly
  5. Add a MIDI keyboard and feed its input directly into the synth
  6. Wrap MIDI in an EventTarget so it can be used for more than just notes
  7. Add basic inputs that tweak the Operator settings, wired directly to MIDI
  8. Create a bad abstraction to marry browser UI to the operator settings across multiple parameters
  9. Replace that abstraction with something that handles updates regardless of source, whether from the browser UI or the knobs on the MIDI controller

I suspect that when we say "build bigger projects," what people hear is that their application needs to spring fully-formed from their head like Athena, but literally nothing I've ever built has been scoped that way. It's always been a gradual accretion of functionality. Caret, for example, started out as just a text box and a keyboard input, and everything else, from tabs to project management, grew from there.

It's not that I don't have a plan at all — I knew from the start, for example, that I'd want a solid system that encapsulated the MIDI handling code and turned it into something more JavaScript-friendly — but the point of experience is learning where to put the grotesque hacks that you'll later replace with those better systems. And you get that experience by failing to make good placeholders on your first few projects.

Did I accomplish my goal of learning to program a DX-7? No. But ultimately, for these kinds of projects, that's not really the point. I learned a lot about sound, how the browser processes it, and how to handle new kinds of input. One day, I might even finish it. Brian Eno, eat your heart out.

August 3, 2021

Filed under: tech»web

The Mythical Document Web

Through a confluence of issues, Safari (Apple's web browser, and the only browser allowed on iOS) has been a hot topic lately:

  • Multiple game streaming services have rolled out in the browser instead of through a centralized app store on iPhones, including Microsoft's Xbox Cloud. These are probably the highest profile web-only apps on iOS in years, and ironically Safari only recently became capable of hosting them.
  • iOS 14.1.1 shipped with a showstopper bug in the IndexedDB API, part of a long stream of bugs that break Safari's ability to store data locally and work offline. Because browser releases are tied to the OS, developers will have to work around this for at least half a year and probably more (since many users don't upgrade promptly).
  • The Safari team asked for feedback about what new features developers would like to prioritize, which reminded everyone that the existing features are largely broken and it's part of a systemic pattern of neglect and abuse. Lord knows I have my own collection of horror stories.

When these kinds of teapot tempests stir up, you can often sort the reaction from the technical community into a few buckets. At the extreme "actually, Safari is good" side, there are people who argue that the web should be replaced or downgraded into something more like Gemini, or restricted to the feature set of HTML 4 and CSS 2 (no scripting allowed). You know: cranks.

But you'll also see a second group proposing that "browsers should be for documents, not for apps" (e.g. browser developers should just stop adding new features entirely and let's split the web in two). In this line of thinking, a browser like Safari that refuses (or is slow) to implement new APIs or features is doing the world a service, because it keeps the ecosystem tilted toward the "document" side instead of the "app" platform side, where Google has too much influence. These opinions seem more reasonable on the surface, but they're also cranks — it's just harder to explain why.

The flaw in the "document browsers, not app platforms" argument is that it assumes that web APIs can be sorted into clear, easily distinguished buckets — or indeed, that there's a bright line between the two. In fact, as someone who almost entirely builds content pages (jargon about "news apps" aside), I often find that in conversations with "app" developers that I'm more experienced with new browser APIs than they are. Most client-side apps, like GMail or Trello, do not actually use that much of a browser's API surface. Even really ambitious applications like Figma mostly just need methods for storage and display, and they've had those (through IndexedDB and canvas) for at least a decade now.

Should browsers be simpler and easier to implement? This kind of argument often feels very intuitive to the "document web" advocates, because they're used to thinking about new APIs through the context of the marketing bullet points for a new operating system. But when you actually look over a list at Can I Use, an awful lot of the "new" APIs are just paving cowpaths: they're designed to replace or reduce common patterns that developers were already hacking onto pages.

  • Beacon API - lets you fire a request at a server without waiting for a response, which means that developers can stop intercepting link clicks and pausing navigation while they send an analytics ping.
  • Fetch - makes it easier to safely load information from a server, replacing XMLHttpRequest (which was hard to use) and JSONP (which was a security nightmare).
  • Intersection Observer - lets developers know when an element has entered or exited the visual viewport without having to poll constantly, which means scrolling gets smoother.
  • Web Crypto - keeps people from shipping huge crypto libraries as a part of their JS bundle, and supports privacy-first features like end-to-end encryption.
  • Web Assembly - creates a stable compilation target for other languages. Developers were already creating other languages that compile to JavaScript, Web Assembly just creates a standard interface and a predictable performance profile.
  • Web Sockets - replaces previous methods of getting fast updates on events, such as constant polling requests or persistent server connections that would take down Apache.
  • Various message channels - lets developers communicate between tabs without abusing sidechannels like window.name or local storage, useful for all the people who have GMail open in seven tabs because they never close anything.
  • Grid and flex layouts - replaces various hacks and JavaScript-based layout systems, including the holy grail: vertically-centered content.

Because JavaScript is a Turing-complete language and web browsers were originally designed with lots of holes in them, none of these APIs are really adding anything new to the browser — it's just that previously, this functionality would have been added by brute force. For example, before browsers created consistent ways to autoplay video without loading a large and dithered .gif file, there were scripts to "play" frames via canvas and a tiled .jpg. You'd be amazed the hacks like this I've seen (and some I've perpetrated).

Are there APIs in Chrome that cross into traditional native app territory? Sure, there's a few, like the Bluetooth or USB access APIs. But while pundits and native developers seem to think those are the vast majority of new browser features, I think it's clear from the listings (and my own experience) that those don't actually represent very much usage in modern apps (they're only about 1/10 of the items on the Can I Use index of JS APIs). They're certainly not what most people complaining about Safari are actually talking about.

What's especially jarring for me, as a visual journalist, is that the same people who rail against the complexity of the web platform will often praise the interactive stories from teams like mine. While I appreciate the support, I can't help but feel that they think our work is less technically challenging or innovative than a "real" developer's, and that they're happy to have a browser push the envelope only as long as it doesn't pose any competition to Apple's revenue stream.

In contrast, if you look at something like my parents' hometown paper (with an ad-blocker, of course), it's not far off from the "document web" ideal — and it looks unbelievably quaint. Despite the warm glow of nostalgia around "the old web" when men were men, browsers were small, and pages were laid out in tables, actually returning to that standard would feel like trying to use DOS for a day: clumsy, slow, and ugly.

That's why when someone says "browsers should be for pages, not for apps," we should ask specifically what they mean by that:

  • Do they mean physically handing Word files around, like we did before Google Docs? Can anyone imagine going back to a native office suite for any kind of collaboration?
  • Are slippy maps okay, or should we go back to the Mapquest experience of clicking a little arrow and waiting for the page to reload in order to see a little more to the east?
  • Do you want responsive charts in your news articles, so that they're legible on any device? Think of all the COVID explainers and election results from the past year — should all those have been rejected for being "too app-like?"
  • Should a person be able to check their e-mail from any computer, or should they have to install a dedicated native client and remember all their server details?
  • Think about all the infrequent tasks you do online, and now imagine that they're all either regressed to the 1998 version or built as native code. Do you think you should have to install an app just to book a flight? To buy a book? To find a new job?

(Incidentally, it's wild how much the mobile market has been distorted on these issues: I think most people would consider it a total non-starter to need to install a desktop app to read Facebook or stream a TV show, but Apple has worked very hard to protect their platform from browser-based options on mobile.)

I think it's possible for someone to look at that list and still insist that yes, they want browsers to be Gopher clients with slightly better font choices. I personally doubt it, though — I suspect most people making the case for a "document-first" web aren't irrational, they just like the romance of the idea and haven't fully thought it through. I sympathize! That doesn't mean we have to take them seriously.

Past - Present