this space intentionally left blank

December 28, 2022

Filed under: fiction»reviews

2022, Mediated: Books

There aren't, unlike in the games roundup, any grand themes to my reading in 2022. I didn't set out to intentionally cover a particular subject, or to read something that I'd been putting off — in fact, I pretty much just read for pleasure. I think it was that kind of year.

My total, as of the time of writing, is 151 books finished, totalling about 55 thousand pages. Two thirds of those were by women or nonbinary authors, and about one third were people of color. Most of my reading was either science fiction, fantasy, or thriller. Twelve books were non-fiction, and only 20 were re-reads.

This is a lot of books and a lot of pages, and most of them weren't very good. In fact, I think one thing I learned this year was to trust myself more on first impressions: there are several titles in the sheet that I bailed on early, then saw in a list or in the "most popular" sort for the library, and thought "I'll give that another shot." Almost without exception I regretted it later.

Since there's no real theme to the reading, and a lot of it was chaff, let's take a look at some of the more exceptional titles.

I can't say enough good things about Rosemary Kirstein's Steerswoman books. They start off as a kind of low fantasy, but it quickly becomes clear that there's more going on. The main character is a "steerswoman," a kind of roving scholar with a simple code: they'll answer any question you have, as long as you answer theirs. The four books in the series so far are satisfying in and of themselves — these were originally published by an actual company, but the rights belong to Kirstein now, and there are two more on the way. I'm extremely excited to see those out. In many ways these remind me of Laurie J. Marks' Elemental Logic series, in that they both mix wide scope with very personal ethics, and also that they're long-running books that are universally loved by the criminally small number of people who have read them.

Like her Claire DeWitt mysteries, Sara Gran's The Book of the Most Precious Substance combines a love of esoteric mystical literature with a noir tone. In this case, instead of a PI who learned her methodology from a French detection manual, it's a book dealer on the hunt for a Necronomicon-like book of magic that will net a profit, as well as more exotic rewards. Gran has a gift for a very specific voice, so if you've enjoyed her other works or you're looking for a capable-but-broken female protagonist, it might be worth checking any of these out.

Sarah Gailey's The Echo Wife remains one of my favorite books of the last decade, and if their Just Like Home can't quite reach those highs, it's still a page-turner. Vera Crowder returns home as her mother is dying, to the house where her father killed and buried half a dozen people during her childhood. The result is a queasy exploration of guilt and culpability, as Vera attempts to understand her own feelings toward her family, and her role in the murders. It may not quite land the ending, but Gailey still milks a tremendous amount of tension from an economical cast and setting, and I'm looking forward to re-reading it in a year or so for a reappraisal. It's wild to imagine all this from an author who first landed on the scene with a goofy fun "steampunk hippo cowboy" novella.

In a year when social networks seem to be imploding left and right, An Ugly Truth may feel redundant. Who needs to read a book about Facebook, i.e. 4Chan for your racist boomer relatives? Yet Frenkel and Kang's detailed account of the Cambridge Analytica era makes a strong case that we still haven't reckoned with just how dumb, sheltered, and destructive Mark Zuckerberg and his company have been. If you have not yet accepted that these kinds of tech companies are the Philip Morris of our generation, this book might convince you.

Nona the Ninth was a tough read for me. I adore Tamsyn Muir's previous books, Gideon and Harrow, and I'm still very much interested to see how she wraps the whole series up. But the explanation around this book was that it started as the opening chapters to that final book, and as it kept growing, it was eventually split off into its own title. I think you can feel that: this is not a book where a lot is happening. It is backstory and setup for the actual ending — well-written, charming setup, because Muir is still funny as hell, but setup nonetheless. I finished it very much feeling like she was stalling for time, in a way that middle chapters often do, but rarely so explicitly.

Kate Beaton's Ducks was also long-anticipated, and here I think the hype was justified. Beaton is known for her history-nerd comic, Hark a Vagrant, as well as some children's books. She's a funny and expressive illustrator, but here she turns those talents to telling the story of her own experiences working on the oil sands in Canada. In many ways, it's a history of an abusive relationship — not just Beaton herself, but her community, trapped in a cycle of dependence on an abusive and destructive industry. Part of what makes this book compelling is Beaton is clear-eyed about the ways that same environment could be funny, or charming, without ignoring its inherent harm.

Finally, Ruthanna Emrys' A Half-Built Garden is, among other things, pointing a new direction for ecological science fiction in an era of climate change. A highly-networked anarchist commune working to clean up the Chesapeake watershed is shocked one day to find that aliens have landed in their backyard, who make an offer: they're here to help, and by help they mean "move humans off the earth," which they see as a doomed ecosystem. And even if the commune isn't interested, the corporations who ruined the planet most certainly are. The resulting negotiations give Emrys a way to poke at all kinds of interesting angles, including social software, for-profit pronouns, found family. While you could lump this into the "cozy sci-fi" movement that started with Becky Chambers, I think it would be a mistake, and that Garden has grander ambitions than it immediately seems. I think about this book a lot.

December 27, 2022

Filed under: gaming»impressions

2022, Mediated: Games

This year, I kept a log in a spreadsheet of the media I took in: books I read, movies I watched, and games I played. As 2022 wraps up, I want to take a moment and look back. I don't do this kind of record-keeping every year — it has the downside of making enjoyment into homework of a kind — but it can be an interesting view into something that might otherwise blur together.

I'm going to start with games, because they're the longest experience of the three. As a result, while I only wrote down books and movies if I finished them (or came very close), I wrote down games when I started them. I was more likely to abandon a game if it turned out I wasn't actually having a good time, and while I may add some titles to my book and movie lists before January 1, I feel pretty confident that I can write now about the shape of the year with relative accuracy.

At the time of this writing, I've played about 90 games in 2022. That sounds like a lot, but I finished fewer than half of them (a metric that's complicated by "evergreen" games like Devil Daggers or roguelike games like Atomicrops or Risk of Rain 2). A number of these were also short, or I just dipped into them and then dipped back out: Landlord of the Woods is about 45 minutes long, and I loaded Rez Infinite up just long enough to run through the new levels again on a whim. I'd estimate half of them were actually serious time investments.

Roughly two-thirds of what I played was new to me, although rarely new releases. However, there's a fun correlation here: I actually completed 2/3 of games I replayed, while those proportions are reversed for new games. I suspect this is because I was more likely to get back into something that I already knew I enjoyed, whereas a lot of the new titles are "browsing": trying out GBA games that I missed during the console's lifetime, wandering through my Steam back catalog, or impulse purchases during sales.

The Year of Souls

In retrospect, soulslikes loomed prominently over my habits this year (as, indeed, they've become pretty influential across the industry). It started in January, when I replayed Sekiro: Shadows Die Twice, a game that I thought was fine in 2019 and grew to strongly appreciate through a second run.

High off the Sekiro experience, I tried Bloodborne again, and I also gave Dark Souls Remastered a sustained attempt. In both cases, I got through a significant portion of the game (up to Vicar Amelia in the former, and most of Anor Londo in the latter) before admitting that while I am sure I could get farther, I just wasn't having fun. I just don't think these games are very good, personally — they feel sluggish (the parries in both are trash), and Dark Souls in particular has not aged well visually, with a real "asset pack" AA-budget vibe to it.

Unfortunately, what I've realized is that the stuff I really like about Sekiro — its mechanical purity, responsive combat with (limited) action cancels, an explicit narrative — are mostly outliers in Fromsoft's catalog. Simultaneously, the things that I find infuriating, like its befuddling and opaque quest chains or cheap encounter design, are in fact the aspects that draw in their most devoted fans.

Still, many of my favorite titles this year were non-Fromsoft soulslikes. The Surge 2 tries a high-risk-high-reward mechanic for parries that's initially frustrating but ultimately feels rewarding to master. Remnant: From the Ashes is a semi-procedural shooter with some great environment design. Tunic is playing more in the adventure space, but there are elements there in its combat and narrative design that are clearly evoking Miyazaki.

There were also some missteps. Ashen is probably the closest to a Fromsoft game (and has at least one dungeon that almost drove me away) but the art direction and writing kept me interested, as each victory builds out your hometown. Darksiders III is a cash grab in a franchise whose brawler roots don't mesh particularly well with punishing checkpoints, but it managed to eke out a few last drops of charm. Neither of these was bad enough to stop playing, but I also can't see myself revisiting them, or recommending either to other people.

(From last year, but also illuminating: Jedi: Fallen Order is blatantly pulling from Sekiro for its lightsaber combat (no complaints here), and its late-game character reveal had me cackling. Death's Door was in many ways a precursor to Tunic, with its Metroidvania progression and isometric combat, and I would argue it's a better game even if it doesn't have the latter's clever manual gimmick.)

As a genre experiment, this year was clarifying. I think I've got a better grasp now on what works for me, and what doesn't. I also feel freshly inoculated against, for example, Elden Ring, which should save me the frustration of playing 30% of a 120-hour game. We'll see whether that lasts.

Other noteworthies, in chronological order

Don't be too put off by the weird, swollen art style of Atomicrops. The underlying combination of light farming sim and bullet-hell twinstick shooter ate up a lot of hours in January. I played it on Switch, and while it's beatable (and fun) there it also feels like it wasn't optimized for the platform — the final boss turns into a slideshow. I'd recommend it, but probably on PC.

Halo Infinite took a lot of criticism for effectively being "what if we made the whole game out of Silent Cartographer," and parts of it do wear thin when it turns into an Ubisoft Map Game. But as the Master Chief Collection rolled the games out on PC, I'd played through all of them fairly recently, and I think you could do a lot worse than an entire game made out of Silent Cartographer (you could, for example, play Halo 4). I would argue this is the game they've been aiming toward for two decades.

I'm as surprised as anyone that in 2022 there would be a game that is A) based on a Marvel property, B) specifically Guardians of the Galaxy, and C) actually pretty good. Eidos Montreal's 2021 title is chatty, irreverent, and pulls a lot of the touchstones of the James Gunn films (a non-stop commentary from the team, Quill's tape deck, the Bautista take on Drax) while ditching their more obnoxious tics (some needlessly fatphobic humor, Chris Pratt). I think the combat does often feel weightless — my advice is to set it to easy so you can get through it faster and get back to the writing and performances.

Immortality is one of those games that's going to have a big influence conceptually but not mechanically. It's an FMV title where you're essentially handed a big box of isolated clips from the career of a b-movie actress, roughly grouped into three films: a giallo-style religious tale, a noir in the style of Basic Instinct, and an extremely 90's thriller that wouldn't be out of place on the Lifetime Movie Network. As you scrub through and build connections between the clips (linked by clicking on objects in a paused frame), a second, more sinister narrative emerges. As a film buff, this felt like it was aimed right at me, and while it can drag a bit when you find yourself hunting the last few segments, I think it achieves exactly what it set out to do.

Off the Hook

Finally, I don't think I can wrap up without mentioning Splatoon 3, a game that was only released in September and probably has more hours in it than anything else I've played. I was S+ rank in Splatoon 2, meaning fairly high-level but not elite (I believe the rank roughly translates into the lower end of the top 10% of players). So I was really looking forward to this.

In design, Splatoon 3 is pure Nintendo. It feels good to play, with varied weapons and precise motion controls. It's brightly colored and fashionable, and has a non-toxic and notoriously LGBT-friendly community with lots of in-game creativity on display. The game's lore is weird and surprisingly grim. Taking team shooter concepts like map control/movement and translating them into literal painted areas is brilliant. Also, the soundtrack is fantastic.

The other classic Nintendo move is the networking stack, which is one of the most atrocious technical foundations for a multiplayer game that I've ever seen. It's barely functional: connections drop regularly, which completely cancels matches and counts as a loss for the disconnecting player, and the matchmaking is laughably bad in the regular ranked mode. It's a tribute to how good everything else is that it can be addictive despite a glaring central flaw.

Splatoon 3 adds a bunch of things that are different from its predecessor, but not always better. For example, the end game poses are no longer gendered and the clothing options are massively more flexible, but to work around that they've added a "catalog" season pass system that unlocks new ending poses or nametags as you play. Since players need to show those off, the game now only shows the winners at the end of the match, which means they cut the adorable tantrum/sulk animations and the more distinctive music after a loss. I do miss it, even while I do enjoy the new variety (and the vastly improved lobby area).

Gripes aside, at the end of the day, if you want a Splatoon experience (and I do), this is where you have to go for it. Nobody else makes anything like this. There's no "splatoonlike" genre, as inconceivable as that seems. It's Nintendo's way or the highway.

There was a lot of noise in 2022 about how the Switch hardware is aging. This isn't wrong! The Tegra chip that the console is based on was not really cutting-edge when it launched, and it's certainly not competing with other consoles — or even phones — at this point. Even so, the Switch is probably at least 50% of my gaming time, and although I have a PS4 hooked up to the same TV, it's almost always used as a DVD player instead.

If you think back to the PS3/360 era, there was a lot of noise made about the first real "HD" consoles. This was, to be fair, a real shift, one that meant games looked sharper but were also radically more expensive. But there were also changes in the kinds of games that became possible at that level of power. This is the time when we first started seeing open-world games like Assassin's Creed or Oblivion, which were not only very big, but also had bustling populations of NPCs and emergent behavior. There's a real case here of new kinds of game design being unlocked by the new generation of hardware.

In the Switch's case, these are often the same kinds of games that it struggles to run at full fidelity (Breath of the Wild excepted, and even there, it's a full world but not a busy one). But when the developer takes more control over the camera or the gameplay, it can return really good results. And in some cases, it can be pretty incredible. The Nier: Automata port this year is certainly not as detailed as it is on a PC, but it's shockingly good.

It may be that there are some designs that are unlocked by the PS5/Xbox Series consoles, just as the open-world genre only really hit its stride a couple of generations back. But it's not clear to me what those are, and in the meantime, I do kind of wish the treadmill would slow down a little. Obviously there's an incentive for them, but Splatoon is a reminder that the Switch can be plenty compelling when developers target the hardware they have, and not what they wish they had.

October 25, 2022

Filed under: journalism»writing

Semi-formal

An uncomfortable truth of modern web journalism is that most people only read the headlines. That's what happens when most of your interactions are mediated through social media, where the headline and a brief teaser surface in the share card, and then all the "fun" interaction is arguing about it in the responses.

There are sensible reactions to this (high on the list: stop letting the copy desk pick the headlines for stories they didn't report) and then there's the new wave of web publications (Politico Pro, Axios, and now Semafor) that have instead decided that the ideal strategy is to just write the story like a social media blurb anyway. From CJR:

Author bylines are, as promised, as prominent as headlines, but the meat of the Semaform concept comes in the text of the story itself, which is broken into distinct sections, each preceded by a capitalized subheading: “THE NEWS” (or “THE SCOOP”), offering the “undisputed facts” of a given story; “THE REPORTER’S VIEW,” which is what it sounds like, with an emphasis on “analysis”; “ROOM FOR DISAGREEMENT,” which is also what it sounds like; “THE VIEW FROM,” promising “different and more global perspectives” on the story in question; and “NOTABLE,” linking out to worthwhile related coverage from other outlets.

I don't consider myself a particularly old-fashioned news reader (I've spent most of my career trying to convince reporters and editors to get a little creative with their formats), but I admit to a visceral repulsion when I read these stories, maybe because they're so prescribed. They often feel, as Timothy Noah writes, so boiled down that they actually impede understanding. They can't be skimmed because there's nothing but skim there.

Even worse, the adherence to the fill-in-the-blanks writing formula (with its pithy, repetitive headers) does its best to drain any distinctiveness from the writers, even while it puts bylines front and center. Take, for example, this David Weigel piece on Oregon Democrats, which chafes deeply against the "Semaform." Weigel gives us 23 paragraphs that would not have been out of place in his Washington Post reporting, followed by a single paragraph of "David's View" (as if the previous reporting was not also his viewpoint), then a "Room for Disagreement" that... doesn't actually disagree with anything. And then "The View from the U.K.," which is a mildly amusing dunk on a British tabloid reporter but adds nothing to the story.

For a more "typical" example of the form, there's this story by Kadia Goba on Marjorie Taylor Greene's deranged anti-trans legislation. Goba is less of a "name," which may explain why her piece is less of a newspaper article with some additional sections jammed onto the end, but it still reads as if a normal inverted-pyramid piece had the subheads inserted at arbitrary locations. The final "View from the U.K." feels like twisting the knife: and now the news from TERF Island.

Here's the thing, to me: picking a story form like this is a great way to make sure nobody can ever remember a story an hour after reading it, because they all blend together. Why hire good journalists if you're not going to let them write? You're never going to get something like Lynda V. Mapes' adorable Rialto coverage in Semafor's article template. It doesn't make any sense for investigative writing. You're certainly not going to get the kinds of interactive or creative storytelling that I work on (although, given that Semafor's visual aesthetic is somewhere between "your dad made a Powerpoint meme" and "Financial Times circa 2008," I'm not sure they care).

Above all, these new outlets feel like a bet on a very specific future of news: one where it's very much a market commodity, as opposed to something that can be pleasurable or rewarding in itself. And maybe that's the right bet! I have my doubts, but my hit rate is no better than any other industry thinker, and I assume you wouldn't do something this joyless without a lot of market research indicating that you can sell it to somebody. But as someone who's been more and more convinced that the only sustainable path for journalism is non-profit, that person isn't me.

June 19, 2022

Filed under: tech»coding

The Many-Threaded Hydra

The Emperor had set out to beat not just Gurgeh, but the whole Culture. There was no other way to describe his use of pieces, territory and cards; he had set up his whole side of the match as an Empire, the very image of Azad.

Another revelation struck Gurgeh with a force almost as great; one reading — perhaps the best — of the way he'd always played was that he played as the Culture. He'd habitually set up something like the society itself when he constructed his positions and deployed his pieces; a net, a grid of forces and relationships, without any obvious hierarchy or entrenched leadership, and initially quite peaceful.

[...] Every other player he'd competed against had unwittingly tried to adjust to this novel style in its own terms, and comprehensively failed. Nicosar was trying no such thing. He'd gone the other way, and made the board his Empire, complete and exact in every structural detail to the limits of definition the game's scale imposed.

Iain M. Banks' classic novel Player of Games follows Jernau Morat Gurgeh, who is sent from the Culture (a socialist utopia that's the standard setting for most of Banks' genre fiction) to compete in a rival society's civil service exam, which takes the form of a complicated wargame named Azad. The game is thought by its adherents to be so complex, so subtle, that it serves as an effective mirror for the empire itself.

Azad is, obviously, not real — it's a thought experiment, a clever dramatic conceit along the lines of Borges' famous 1:1 scale map. But we have our own Azad, in a way: as programmers, it's our job to create systems of rules and interactions that model a problem. Often this means we intentionally mimic real-world details in our code. And sometimes it may mean that we also echo more subtle values and viewpoints.

I started thinking about this a while back, after reading about how some people think about the influences on their coding style. I do think I have a tendency to lean into "playful" or expressive JavaScript features, but that's just a symptom of a low boredom threshold. Instead, looking back on it, what struck me most about my old repos was a habitual use of what we could charitably call "collaborative" architecture.

Take Caret, for example: while there are components that own large chunks of functionality, there's no central "manager" for the application or hierarchy of control. Instead, it's built around a pub/sub command bus, where modules coordinate through broadcasts of custom events. It's not doctrinaire about it: there's still lots of places where modules call into each other directly (probably too many, actually). But for the most part Caret is less like a modern component tree, and more like a running conversation between equal actors.
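To make the shape of that concrete, here is a minimal sketch of a command bus in plain JavaScript. It is not Caret's actual code: the CommandBus class, its on and fire methods, and the "file:save" command are all names invented for illustration, and a real project might just as well build on the DOM's EventTarget instead.

```javascript
// A minimal pub/sub command bus. Every name here is hypothetical.
class CommandBus {
  constructor() {
    // Maps a command name to the Set of callbacks listening for it.
    this.listeners = new Map();
  }

  // Register a listener. Returns a function that unsubscribes it again.
  on(command, callback) {
    if (!this.listeners.has(command)) {
      this.listeners.set(command, new Set());
    }
    this.listeners.get(command).add(callback);
    return () => this.listeners.get(command).delete(callback);
  }

  // Broadcast a command to everyone listening, and report how many heard it.
  fire(command, detail) {
    // Snapshot first, so a listener can unsubscribe during the broadcast.
    const snapshot = [...(this.listeners.get(command) ?? [])];
    for (const callback of snapshot) {
      callback(detail);
    }
    return snapshot.length;
  }
}

// Two "modules" that never reference each other, only the shared bus.
const bus = new CommandBus();
const log = [];
bus.on("file:save", (detail) => log.push(`status bar: saved ${detail.name}`));
const stopTabs = bus.on("file:save", (detail) =>
  log.push(`tabs: cleared dirty flag on ${detail.name}`)
);

bus.fire("file:save", { name: "notes.txt" });
```

Neither listener knows the other exists, and whatever fires the command doesn't know who is listening. That is the "conversation between equals" quality, for better and for worse: nothing enforces ordering or ownership, which is part of why direct calls tend to creep back in.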

I've been using variations on this design for a long time: the first time I remember employing it is the (now defunct) economic indicator dashboard I built for CQ, which needed to coordinate filters and views between multiple panels. But you can also see it in the NPR primary election rig, Weir's new UI, and Chalkbeat's social media card generator, among others. None of these have what we would typically think of as a framework's "inversion of control." I've certainly built more traditional, framework-first applications, but it's pretty obvious where my mind goes if given free rein.

(I suspect this is why I've taken so strongly to web components as a toolkit: because they provide hooks for managing their own lifecycle, as well as direct connection to the existing event system of the DOM, they already work in ways that are strongly compatible with how I naturally structure code. There's no cost of convenience for me there.)

There are good technical reasons for preferring a pub/sub architecture: it maps nicely onto the underlying browser platform, it can grow organically without having to plan out a UML diagram, and it's conceptually easy to understand (even if you don't just subclass EventTarget, you can implement the core command bus in five minutes for a new project). But I also wondered if there are non-technical reasons that I've been drawn to it — if it's part of my personal Azad/Culture strategy.

I'm also asking this in a very different environment than even ten years ago, when we used to see coyly neo-feudalist projects like Urbit gloss over their political design with a thick coat of irony. These days, the misnamed "web3" movement is explicit about its embrace of the Californian ideology: not just architecture that exists inside of capitalism, but architecture as capitalism, with predictable results. In 2022, it's not quite so kooky to say that code is cultural.

I first read Rediker and Linebaugh's The Many-Headed Hydra: Sailors, Slaves, Commoners, and the Hidden History of the Revolutionary Atlantic in college, which introduced me to the concept of hydrarchy: a type of anarchism formed by the "motley crew" of pirate ships in contrast to the strict class structures of merchant companies. Although they still had captains who issued orders, that leadership was not absolute or unaccountable, and it was common practice for pirates to put captured ship captains at the mercy of their crews as a taste of hydrarchy. A share system also meant that spoils were distributed more equally than was the case on merchant ships.

The hydrarchy was a huge influence on me politically, and it still shapes the way I manage teams and projects. But is it possible that it also influenced the ways I tend to think about and write code systems? This is a silly question, but not I think a stupid one: a little introspection can be valuable, especially if it provides insight in how to explain our work to beginners or accommodate their own subconscious worldviews.

This is not to say that, for example, Caret is an endorsement of piracy, or even a direct analog (certainly not in the way that web3 is tied to venture capitalism). But it was built the way it was because of who did the building. And its design did have cultural implications: building on top of events means that you could write a Caret plugin just by sending messages to its Chrome process, including commands for the Ace editor. The promise (not always kept, to be fair) was that your external code was using the same APIs that I used internally — that you were a collaborator with the editor itself. You had, as it were, an equal share in the outcome.

As we think about what the "next era of JavaScript" looks like, there's a tendency to express it in terms of platforms and layers. This isn't wrong! But if we're out here dreaming up new workflows empowered by edge computing, I think we can also spare a little whimsy for models beyond "pure render functions" or "strict hierarchy of control," and a little soul-searching about what those models for the next era might mean about our own mindsets.

March 31, 2022

Filed under: tech»open_source

CTRL alt QMK

Like a lot of people during the pandemic, early last year I got into mechanical keyboard collecting. Once you start, it's an easy hobby to sink a lot of time and money into, but the saving grace is that it was ridiculously inconvenient even before the supply chain imploded: everything is a "group buy" or some other micro-production release, so it tends to be fairly self-limiting.

I started off with a Drop CTRL, which is a pretty basic mechanical that serves as a good starting point. Then I picked up a Keychron Q1, a really sharp budget board that convinced me I need more keys than a 75% layout, and finally a NovelKeys NK87 with Box Jade clicky switches, which is just a lovely piece of hardware and what I'm using to type this.

All three of these keyboards are (very intentionally) compatible with the open-source QMK firmware. QMK is very cool, and ideally it means that any of these keyboards can be extended, customized, and updated in any way I want. For example, I have a toggle set up on each board that turns the middle of the layout into a number pad, for easier spreadsheet edits and 2FA inputs. That's the easy mode — if you really want to dig in and write some C, these keyboards run on ARM chips somewhere on the order of a Nintendo DS, so the sky's pretty much the limit.

That said, "compatible" is a broad term. Both the Q1 and NK87 have full QMK implementations, including support for VIA for live key-remapping and macros, but the CTRL (while technically built on QMK) is usually configured via a web service. It's mostly reliable, but there have been a few times in the last few months where the firmware I got back after remapping keys was buggy or unreliable, and this week I decided I wanted to skip the middleman and get QMK building for the CTRL, including custom lighting.

Well, it could have been easier, that's for sure. In getting the firmware working the way I wanted it, I ended up having to trawl through a bunch of source code and blog posts that always seemed to be missing something I needed. So I decided I'd write up the process I took, before I forget how it went, in case I need it in the future or someone else finds it helpful.

Building firmware

The QMK setup process is reasonably well documented--it's a Python package, mostly, wrapped around a compilation toolchain. It'll clone the repo for you and install a qmk command that manages the process. I set mine up on WSL and was up and running pretty quickly.

Once you have the basics going, you need to create a "keymap" variation for your board. In my case, I created a new folder at qmk_firmware/keyboards/massdrop/ctrl/keymaps/thomaswilburn. There are already a bunch of keymaps in there, which is one of the things that gives QMK a kind of ramshackle feel, since they're just additions by randos who had a layout that they like and now everyone gets a copy. Poking around these can be helpful, but they're often either baroque or hyperspecialized (one of them enables the ability to programmatically trigger individual lights from terminal scripts, for example).

However, the neat thing about QMK's setup is that the files in each keymap directory are loaded as "overrides" for the main code. That means you only need to add the files that change for your particular use, and in most cases that means you only need keymap.c and maybe rules.mk. In my case, I copied the default_md folder as the starting place for my setup, which only contains those files. Once that's done, you should be able to test that it builds by running qmk compile -kb massdrop/ctrl -km thomaswilburn (or whatever your folder was named).

Once you have a firmware file, you can send it to the keyboard by using the reset button on the bottom of the board and running Drop's mdloader utility.

Remapping

QMK is designed around the concept of layers, which are arrays of layout config stacked on top of each other. If you're on layer #3 and you press X, the firmware checks its config to see if there's a defined code it should send for that physical key on that layer. QMK can also have a slot defined as "transparent," which means that if there's not a code assigned on the current layer, it will check the next one down, until it runs out. So, for example, my "number pad" layer defines U as 4, I as 5, and so on, but most of the keys are transparent, so pressing Home or End will fall through and do the right thing, which saves time having to duplicate all the basic keys across layers.
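
To make the fall-through idea concrete, here's a minimal sketch in C. It is not QMK's actual lookup code: the table size, the key codes, and the names resolve_key, KEY_TRANSPARENT, and KEY_NONE are all made up for illustration, and it assumes every layer below the active one is switched on (real QMK only consults layers that are currently enabled).

```c
#include <assert.h>
#include <stdint.h>

#define LAYER_COUNT 4
#define KEY_COUNT 3

/* Hypothetical stand-ins for QMK's "transparent" and "do nothing" codes. */
#define KEY_TRANSPARENT 0xFFFFu
#define KEY_NONE 0x0000u

/* Made-up code for the Home key, used only in this example table. */
#define CODE_HOME 0x4Au

/* Three physical keys (U, I, Home) across four layers.
 * Layer 2 plays the role of the "number pad" layer. */
static const uint16_t example_layers[LAYER_COUNT][KEY_COUNT] = {
    { 'u', 'i', CODE_HOME },                                /* layer 0: base */
    { KEY_TRANSPARENT, KEY_TRANSPARENT, KEY_TRANSPARENT },  /* layer 1 */
    { '4', '5', KEY_TRANSPARENT },                          /* layer 2: number pad */
    { KEY_TRANSPARENT, KEY_TRANSPARENT, KEY_TRANSPARENT },  /* layer 3 */
};

/* Start at the active layer and walk down until a slot is not transparent. */
uint16_t resolve_key(const uint16_t layers[LAYER_COUNT][KEY_COUNT],
                     int active_layer, int key) {
    assert(active_layer >= 0 && active_layer < LAYER_COUNT);
    assert(key >= 0 && key < KEY_COUNT);
    for (int layer = active_layer; layer >= 0; layer--) {
        uint16_t code = layers[layer][key];
        if (code != KEY_TRANSPARENT) {
            return code;
        }
    }
    return KEY_NONE; /* ran out of layers without finding a code */
}
```

With this table, resolve_key(example_layers, 2, 0) gives '4' for the U key, while the Home key on the same layer falls through to the base layer's code.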

If your board supports VIA, remapping the layer assignments is easy to do in software, and your keymap file will just contain mostly empty layers. But since the CTRL doesn't support VIA, you have to assign them manually in C code. Luckily, the default keymap has the basics all set up, as well as a template for an all-transparent layer that you can just copy and paste to add new ones. You can see my layer assignments here. The _______ spaces are transparent, and XXXXXXX means "do nothing."

There's a full list of keycodes in the QMK docs, including a list of their OS compatibility (MacOS, for example, has a weird relationship with things like "number lock"). Particularly interesting to me are some of the combos, such as LT(3, KC_CAPS), which means "switch to layer three if held, but toggle caps lock if tapped." I'm not big on baroque chord combinations, but you can make the extended functions a lot more convenient by taking advantage of these special layer behaviors.
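
The tap-versus-hold decision behind something like LT(3, KC_CAPS) can be roughly sketched as a timing check. This is a simplification, not QMK's implementation: the real firmware also reacts to other keys pressed in the meantime, and it can commit to "hold" before the key is released. The names classify_press and key_action_t are hypothetical, and the 200 ms threshold assumes QMK's default tapping term has not been changed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match QMK's default tapping term of 200 ms. */
#define TAPPING_TERM_MS 200u

typedef enum { ACTION_TAP, ACTION_HOLD } key_action_t;

/* Classify a press by how long the key stayed down.
 * Short presses count as a tap (toggle caps lock);
 * longer ones count as a hold (switch to the layer). */
key_action_t classify_press(uint32_t pressed_at_ms, uint32_t released_at_ms) {
    assert(released_at_ms >= pressed_at_ms);
    uint32_t held_for_ms = released_at_ms - pressed_at_ms;
    return held_for_ms < TAPPING_TERM_MS ? ACTION_TAP : ACTION_HOLD;
}
```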

Ultimately, my layers are pretty straightforward: layer 0 is the standard keyboard functions. Layer 1 is fully transparent, and is just used to easily toggle the lighting effects off and on. Layer 2 is number pad mode, and Layer 3 triggers special keyboard behaviors, like changing the animation pattern or putting it into "firmware flash" mode.

Lighting

Getting the firmware compiling was pretty easy, but for some reason I could not get the LED lighting configuration to work. It turns out that there was a pretty silly answer for this. We'll come back to it. First, we should talk about how lights are set up on the CTRL.

There are 119 LEDs on the CTRL board: 87 for the keys, and then 32 in a ring around the edges to provide underglow. These are addressed in the QMK keymap file using a legacy system that newer keyboards eschew, I think because it was easier for Drop to build their web config tool around the older syntax. I like the new setup, which lets you explicitly specify ranges in a human-readable way, but the Drop method isn't that much more difficult.

Essentially, the keymap file should set up an array called led_instructions filled with C structs configuring the LED system, which you can see in my file here. If you don't write a lot of C, the notation for the array may be unfamiliar, but these unordered structs aren't too different from, say, JavaScript objects, except that the property names have to start with a dot. Each one gets evaluated in turn for each LED, and a set of flags tells QMK what conditions it requires to activate and what it does. These flags are:

  • LED_FLAG_USE_PATTERN - indicates that you're going to set a specific pattern by index from the set of different animations that the CTRL ships by default. For example, .pattern = 3 should activate the teal/salmon gradient.
  • LED_FLAG_USE_ROTATE_PATTERN - indicates that you want to use the user-selectable pattern, which the user can switch between using hotkeys.
  • LED_FLAG_USE_RGB - indicates that instead of using a preset color or pattern, you'll provide custom RGB values for the LEDs.
  • LED_FLAG_MATCH_LAYER - will only apply this lighting when the current layer matches the provided index.
  • LED_FLAG_MATCH_ID - will only apply this lighting to LEDs matching an ID bitmask.
Combining these gives you a lot of flexibility. For example, let's say I want to light up the keys in the "number pad" (7-9, U-O, J-L, and M-period) in bright green when layer #2 is active. For that case, the struct looks something like this:
{
  .flags = LED_FLAG_MATCH_LAYER | 
    LED_FLAG_USE_RGB | 
    LED_FLAG_MATCH_ID,
  .g = 255, 
  .id0 = 0x03800000,
  .id1 = 0x0E000700,
  .id2 = 0xFF8001C0,
  .id3 = 0x00FFFFFF,
  .layer = 2
},
The flags mean that this will only apply when the active layer matches the .layer property, we're going to provide color byte values (just .g in this case, since the red and blue values are both zero), and only LEDs matching the bitmask in .id0 through .id3 will be affected.

Most of this is human-readable, but those IDs are a pain. They are effectively a bitmask of four 32-bit integers, where each bit corresponds to an LED on the board, starting from the escape key (id 0) and moving left-to-right through each row until you get to the right arrow in the bottom-right of the keyboard (id 86), and then proceeding clockwise all around the edge of the keyboard. So for example, to turn on the leftmost keys on the keyboard, you'd take their IDs (0 for escape, 16 for `, 33 for tab, 50 for capslock, 63 for left shift, and 76 for left control), divide by 32 to find out which .idX value you want, and then modulo 32 to set the correct bit within that integer (in this case, the result is 0x00010001 0x80040002 0x00001000). That's not fun!
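
The divide-and-modulo step is mechanical enough to sketch in a few lines of C. The function name build_id_masks and the left_column array are hypothetical, and the tab key is taken as ID 33 here, which is the value consistent with the 0x80040002 result quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LED_COUNT 119 /* 87 key LEDs plus 32 edge LEDs */

/* Fill masks[0..3] (the .id0 to .id3 values) from a list of LED IDs. */
void build_id_masks(const unsigned *led_ids, size_t count, uint32_t masks[4]) {
    masks[0] = masks[1] = masks[2] = masks[3] = 0;
    for (size_t i = 0; i < count; i++) {
        assert(led_ids[i] < LED_COUNT);
        /* id / 32 picks the integer, id % 32 picks the bit inside it. */
        masks[led_ids[i] / 32] |= (uint32_t)1 << (led_ids[i] % 32);
    }
}

/* Escape, `, tab, caps lock, left shift, left control. */
static const unsigned left_column[] = { 0, 16, 33, 50, 63, 76 };
```

Passing left_column and a count of 6 fills the array with 0x00010001, 0x80040002, 0x00001000, and 0, ready to paste into a struct.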

Other people who have done this have used a Python script that requires you to manually input the LED numbers, but I'm a web developer. So I wrote a quick GUI for creating the IDs for a given lighting pattern: click to toggle a key, and when the diagram is focused you can also press physical keys on your keyboard to quickly flip them off and on. The input contains the four ID integers that the CTRL expects when using the LED_FLAG_MATCH_ID option.

Using this utility script, it was easy to set up a few LED zones in a Vilebloom theme that, for me, evokes the classic PDP-11 console. But as I mentioned before, when I first started configuring the LED system, I couldn't get anything to show up. Everything compiled and loaded, and layers worked, but no lights appeared.

What I eventually realized, to my chagrin, was that the brightness was turned all the way down. Self-compiled QMK tries to load settings from persistent memory, including the active LED pattern and brightness, but I suspect the Drop firmware doesn't save them, so those addresses were zero. After I used the function keys to increase the backlight intensity, everything worked great.

In review

As a starter kit, the CTRL is pretty good. It's light but solidly constructed with an aluminum case, relatively inexpensive, and it has a second USB-C port if you want to daisy-chain something else off it. It's a good option if you want to play around with some different switch options (I added Halo Clears, which are pingy but have the same satisfying snap as that one Nokia phone from The Matrix).

It's also weirdly power-hungry, the integrated plate means it's stiff and hard to dampen acoustically, it only takes 3-prong switches, and Drop's software engineering seems to be stretched a little thin. So it's definitely a keyboard that you can grow beyond. But I'm glad I put the time into getting the actual open source firmware working — at the very least, it can be a fun board for experimenting with layouts and effects. And if you're hoping to stretch it a little further than its budget roots, I hope the above information is useful.

October 11, 2021

Filed under: politics»issues»technology

Filterless

A few years ago, right when I moved to Chicago, I was working for a small tech consulting company. The founders were decent guys, well-meaning, progressive. I didn't love the work — "the customer is always right" is not a mindset I easily adopt — but it seemed generally harmless. So one day I was surprised to hear, in an all-staff meeting, that a coworker's web performance audit for Philip Morris had been a rousing success, and might lead to follow-up work.

Listen, I said in my regular check-in with the COO, I know we've got to pay the bills. I'm not trying to be a drama magnet. But this is Philip Morris, one of the most amoral corporate predators in the world. My grandfather died of cancer after smoking his whole life. It's a complex world, but I believe you can draw a few bright lines even under capitalism, and the two easiest examples are Literal Nazis and Big Tobacco.

Ever since I left that company, when I apply somewhere, I try to mention this scenario and ask "What are the clients you wouldn't take? Do you have a process for making that decision?" You may or may not be surprised to find that most people do not have a particularly good response. I interviewed for one well-known agency, and after a pause, the manager said "...that's a very 'journalist' question." I assume that was not meant to be flattering. They'd done work for Facebook, he said, after thinking about it, and for a lot of people that might be the same thing.

This week, there are a lot of comparisons between Facebook and Big Tobacco as whistleblower testimony from Frances Haugen confirms that the company is not only leaning on addiction as a business strategy, but has also been sitting on internal research about how awful its product is (Big Oil is probably a closer parallel). These discoveries aren't new. But Haugen's testimony and leaks are doing a good job of cutting through the usual dynamic around regulating Facebook, where Republicans insist that the company is biased against them (it's emphatically not), and Democrats wring their hands ineffectually.

Let's be clear: Philip Morris made $8 billion last year in profit. It's still a member of the Fortune 500. Apparently, for a lot of people, it's just another client. If this is the comparison for Facebook, they're going to be fine.

But at the same time, regardless of the bottom line, you can see the perception changing. Remember the interview where I was told that it was a "journalist" question to ask about client choice? This week, that agency's founders spent their podcast extending the tobacco and oil comparisons, arguing that Facebook should be regulated if it can't be eradicated. Now, maybe the right hand and left hand aren't talking to each other here (I wasn't interviewing with either of the speakers on the podcast), but that feels like a shift to me.

In the web community, it's time to start collectively questioning the norms around collaborating with Menlo Park. We have the ability to change the perception and access of Facebook — just look at Oracle or Intellectual Ventures! Like the tobacco companies, they're still profitable, of course. The market's gonna market. But it's harder for them to hire. Their influence and mindshare in shaping the conversation are substantially diminished. Nobody's excited about their output.

It was weird to me, when I questioned the Philip Morris contract, that nobody else seemed to have raised an issue. It hadn't even really occurred to them. But it definitely stuck with management: even when I left to go to NPR, it came up in the exit interview. There was a little unease that hadn't been there before. Sometimes that's all it takes.

Facebook, and by extension working with Facebook, or using code that comes from Facebook, should be considered embarrassing, or shameful. You can argue against using React or GraphQL on valid technical terms, in addition to the obvious moral hazard. Integrations with Facebook code should be isolated, treated as untrustworthy, and built in such a way that they can be replaced. When you meet Facebook employees at conferences or gatherings, let them know that it's nothing personal, but you just don't feel clean building on top of that legacy, and you hope they find a better place to work soon.

Imagine an airport smoking lounge: a dingy, nicotine-stained room where participants have to stand and face each other, away from everyone else. Now make that for Facebook. Regulation takes time and lobbying, but low-key shame is free and easily renewable.

September 21, 2021

Filed under: tech»coding

I am FM

My last day at NPR was September 3, and I started at Chalkbeat on September 13. In the nine days in between, I tried to detox: I stayed away from the news, played a lot of Castlevania, and — in an effort to not feel completely useless — worked on a project I've been meaning to tackle for a while: I wrote a browser-based FM synth modeled on the classic Yamaha DX-7.

The DX-7 is the classic Lament Configuration of digital sound design. It's not only based on a model of synthesis that's unintuitive, but Yamaha wrapped it in a pushbutton user interface that discourages experimentation. Its sound defined an era almost entirely through the presets: the piano arpeggios from Twin Peaks, countless Whitney Houston ballads, and the bass line from Take on Me. I never owned a genuine DX-7, but I had one of Yamaha's budget models, and learning to build sounds on it was a long-standing white whale of mine.

Modulation Operations

Most synthesizers are what we call additive and subtractive. You generate a waveform, either by combining different wave shapes (sine, rectangle, sawtooth, or triangle) or using a noise generator, and then patch that through a series of filters and effects, and out the other end either emerges a transcendent reinterpretation of Bach (if you're Wendy Carlos) or a kind of deranged squawking (if you're me). This kind of synthesis isn't easy, per se, but it makes sense to someone who has used, say, a guitar pedalboard.

The DX-7 works differently, using something called frequency modulation (FM) synthesis. Essentially, it uses up to six sine wave oscillator units (called "operators"), but most of them aren't audible at any given time. Instead, the secondary units (called "modulators") are used to tweak the frequency of the audible operators ("carriers"). When the wave output of the modulator goes up, so does the frequency of its carrier. When the wave goes down, the frequency dips. Since these changes in frequency happen many times a second, and are often scaled to the input pitch from the keyboard, the results are complicated harmonic patterns, often described as metallic, bell-like, percussive.
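
As a rough illustration of one modulator driving one carrier, here's a minimal sketch in C. It is neither the DX-7's algorithm nor the code from my synth: fm_render, approx_sine, and all the parameters are invented for the example, and the sine comes from Bhaskara's approximation (good to a fraction of a percent) just to keep the block free of library dependencies.

```c
#include <assert.h>
#include <stddef.h>

/* Approximate sin(2*pi*phase), with phase measured in cycles.
 * Uses Bhaskara's formula on each half of the cycle. */
static double approx_sine(double phase) {
    phase -= (double)(long)phase;   /* drop whole cycles */
    if (phase < 0.0) phase += 1.0;  /* wrap negatives into [0, 1) */
    double half = phase < 0.5 ? phase * 2.0 : (phase - 0.5) * 2.0;
    double p = half * (1.0 - half);
    double s = 16.0 * p / (5.0 - 4.0 * p);
    return phase < 0.5 ? s : -s;
}

/* Render `count` samples of a carrier whose frequency is pushed up and
 * down by a modulator. depth_hz is how far the modulator can shift the
 * carrier's frequency; a depth of 0 leaves a plain sine wave. */
void fm_render(double *out, size_t count, double sample_rate,
               double carrier_hz, double modulator_hz, double depth_hz) {
    assert(out != NULL && sample_rate > 0.0);
    double carrier_phase = 0.0;
    double modulator_phase = 0.0;
    for (size_t i = 0; i < count; i++) {
        double mod = approx_sine(modulator_phase);
        out[i] = approx_sine(carrier_phase);
        /* Modulator up: carrier frequency rises. Modulator down: it dips. */
        double instantaneous_hz = carrier_hz + depth_hz * mod;
        carrier_phase += instantaneous_hz / sample_rate;
        modulator_phase += modulator_hz / sample_rate;
    }
}
```

Rendering the same carrier twice, once with depth_hz at 0 and once with a nonzero depth, is a quick way to see the waveform bend away from a plain sine.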

In the original DX-7 hardware, this is all done using a polynomial math equation, effectively shifting the sample location for the carrier wave based on the modulator value (there's a useful .gif at Wikipedia illustrating the principle). You can do this with relatively cheap processors, and indeed that's how most of the JavaScript implementations still do it: they generate a stream of audio data directly from the phase math, and pipe that to an output. But for the sake of prototyping, I decided to do it a different way, using the native WebAudio processing graph.

Adapting theory to practice

WebAudio is a kind of beautiful monstrosity. It's easy to imagine an API designer deciding that browser audio should be mostly WebGL-style primitives, basically just handing you an audio buffer array and leaving the sound generation up to you. Or the pendulum could have swung the other way, toward extreme user-friendliness, with just a slightly more performant version of the <audio> tag letting you load and trigger preset clips.

Instead, the final API ends up looking more akin to a classic Moog patchbay or a studio effects rack, letting you wire various modules together into a complex signal chain. Those nodes start out as simple oscillators and gain amplifiers, but from there it gets pretty batteries-included: nodes for impulse convolution, multiple shaped filters, and compression, plus a custom "script worker" node as an escape hatch.

Crucially for my purposes, WebAudio signal nodes can be wired to more than just audio inputs and outputs. You can also hook nodes into the control parameters, so that the output from one changes the volume or strength of another. In our case, we can use the audio signal from our modulators and pipe it into the frequency value of our carriers. It's not quite the same as the classic DX-7 formula, but it performs very well, and the sound actually isn't that far off. You can hear the classic EPIANO1 preset adapted to my code on the GitHub demo page for the project.

However, while this implementation felt more intuitive than juggling Math.sin(), WebAudio also has some quirks that made it tricky. For example, oscillators are single-shot: they can only be started and stopped once. The API is full of this kind of design, where you're supposed to create nodes, connect them to the graph, and then throw them away. But when you have modulator oscillators feeding into carrier oscillators in a complicated web of amplifiers and filters, disposable audio sources don't really fit the design.

In the end, I had to wrap the whole thing in a disposable Voice class that encapsulates an arrangement of operators for a single note. When the synth is asked to play a sound, it creates a Voice containing a fresh set of operators, hooks that up to the audio context, and sends it on its merry way. This effectively makes our synthesizer polyphonic by default, since each individual frequency gets its own voice on demand. It feels wasteful, but it works.

Gradual complexity

Working on a project like this makes me think a lot about how it is that I build projects, and how to teach others to do the same. We often tell junior developers that they should learn by creating something fairly complex, but we don't really tell them how to do it. I suspect this is because it becomes fairly instinctual over time, so it's hard to explain.

Part of what we don't tell junior developers is that big projects are built out of little projects, one level of abstraction at a time. For example, for the Hello Operator repo, the process of getting a (mostly) working synthesizer looks like:

  1. Hook up some basic oscillators and trigger them on a timer
  2. Wrap those oscillators in an Operator class and connect them together
  3. Instead of using a timer, set the keyboard to trigger playback
  4. Wrap the Operator objects in a Voice, so that they can be played repeatedly
  5. Add a MIDI keyboard and feed its input directly into the synth
  6. Wrap MIDI in an EventTarget so it can be used for more than just notes
  7. Add basic inputs that tweak the Operator settings, wired directly to MIDI
  8. Create a bad abstraction to marry browser UI to the operator settings across multiple parameters
  9. Replace that abstraction with something that handles updates regardless of source, whether from the browser UI or the knobs on the MIDI controller

I suspect that when we say "build bigger projects," what people hear is that their application needs to spring fully-formed from their head like Athena, but literally nothing I've ever built has been scoped that way. It's always been a gradual accretion of functionality. Caret, for example, started out as just a text box and a keyboard input, and everything else, from tabs to project management, grew from there.

It's not that I don't have a plan at all — I knew from the start, for example, that I'd want a solid system that encapsulated the MIDI handling code and turned it into something more JavaScript-friendly — but the point of experience is learning where to put the grotesque hacks that you'll later replace with those better systems. And you get that experience by failing to make good placeholders on your first few projects.

Did I accomplish my goal of learning to program a DX-7? No. But ultimately, for these kinds of projects, that's not really the point. I learned a lot about sound, how the browser processes it, and how to handle new kinds of input. One day, I might even finish it. Brian Eno, eat your heart out.

August 3, 2021

Filed under: tech»web

The Mythical Document Web

Through a confluence of issues, Safari (Apple's web browser, and the only browser allowed on iOS) has been a hot topic lately:

  • Multiple game streaming services have rolled out in the browser instead of through a centralized app store on iPhones, including Microsoft's Xbox Cloud. These are probably the highest profile web-only apps on iOS in years, and ironically Safari only recently became capable of hosting them.
  • iOS 14.1.1 shipped with a showstopper bug in the IndexedDB API, part of a long stream of bugs that break Safari's ability to store data locally and work offline. Because browser releases are tied to the OS, developers will have to work around this for at least half a year and probably more (since many users don't upgrade promptly).
  • The Safari team asked for feedback about what new features developers would like to prioritize, which reminded everyone that the existing features are largely broken and it's part of a systemic pattern of neglect and abuse. Lord knows I have my own collection of horror stories.

When these kinds of teapot tempests stir up, you can often sort the reaction from the technical community into a few buckets. At the extreme "actually, Safari is good" side, there are people who argue that the web should be replaced or downgraded into something more like Gemini, or restricted to the feature set of HTML 4 and CSS 2 (no scripting allowed). You know: cranks.

But you'll also see a second group proposing that "browsers should be for documents, not for apps" (i.e. browser developers should just stop adding new features entirely and let's split the web in two). In this line of thinking, a browser like Safari that refuses (or is slow) to implement new APIs or features is doing the world a service, because it keeps the ecosystem tilted toward the "document" side instead of the "app" platform side, where Google has too much influence. These opinions seem more reasonable on the surface, but they're also cranks; it's just harder to explain why.

The flaw in the "document browsers, not app platforms" argument is that it assumes that web APIs can be sorted into clear, easily distinguished buckets, or indeed that there's a bright line between the two. In fact, as someone who almost entirely builds content pages (jargon about "news apps" aside), I often find in conversations with "app" developers that I'm more experienced with new browser APIs than they are. Most client-side apps, like GMail or Trello, do not actually use that much of a browser's API surface. Even really ambitious applications like Figma mostly just need methods for storage and display, and they've had those (through IndexedDB and canvas) for at least a decade now.

Should browsers be simpler and easier to implement? This kind of argument often feels very intuitive to the "document web" advocates, because they're used to thinking about new APIs through the context of the marketing bullet points for a new operating system. But when you actually look over a list at Can I Use, an awful lot of the "new" APIs are just paving cowpaths: they're designed to replace or reduce common patterns that developers were already hacking onto pages.

  • Beacon API - lets you fire a request at a server without waiting for a response, which means that developers can stop intercepting link clicks and pausing navigation while they send an analytics ping.
  • Fetch - makes it easier to safely load information from a server, replacing XMLHttpRequest (which was hard to use) and JSONP (which was a security nightmare).
  • Intersection Observer - lets developers know when an element has entered or exited the visual viewport without having to poll constantly, which means scrolling gets smoother.
  • Web Crypto - keeps people from shipping huge crypto libraries as a part of their JS bundle, and supports privacy-first features like end-to-end encryption.
  • WebAssembly - creates a stable compilation target for other languages. Developers were already creating other languages that compile to JavaScript; WebAssembly just creates a standard interface and a predictable performance profile.
  • Web Sockets - replaces previous methods of getting fast updates on events, such as constant polling requests or persistent server connections that would take down Apache.
  • Various message channels - lets developers communicate between tabs without abusing sidechannels like window.name or local storage, useful for all the people who have GMail open in seven tabs because they never close anything.
  • Grid and flex layouts - replaces various hacks and JavaScript-based layout systems, including the holy grail: vertically-centered content.

Because JavaScript is a Turing-complete language and web browsers were originally designed with lots of holes in them, none of these APIs are really adding anything new to the browser; it's just that previously, this functionality would have been added by brute force. For example, before browsers created consistent ways to autoplay video without loading a large and dithered .gif file, there were scripts to "play" frames via canvas and a tiled .jpg. You'd be amazed at the hacks like this I've seen (and some I've perpetrated).

Are there APIs in Chrome that cross into traditional native app territory? Sure, there's a few, like the Bluetooth or USB access APIs. But while pundits and native developers seem to think those are the vast majority of new browser features, I think it's clear from the listings (and my own experience) that those don't actually represent very much usage in modern apps (they're only about 1/10 of the items on the Can I Use index of JS APIs). They're certainly not what most people complaining about Safari are actually talking about.

What's especially jarring for me, as a visual journalist, is that the same people who rail against the complexity of the web platform will often praise the interactive stories from teams like mine. While I appreciate the support, I can't help but feel that they think our work is less technically challenging or innovative than a "real" developer's, and that they're happy to have a browser push the envelope only as long as it doesn't pose any competition to Apple's revenue stream.

In contrast, if you look at something like my parents' hometown paper (with an ad-blocker, of course), it's not far off from the "document web" ideal — and it looks unbelievably quaint. Despite the warm glow of nostalgia around "the old web" when men were men, browsers were small, and pages were laid out in tables, actually returning to that standard would feel like trying to use DOS for a day: clumsy, slow, and ugly.

That's why when someone says "browsers should be for pages, not for apps," we should ask specifically what they mean by that:

  • Do they mean physically handing Word files around, like we did before Google Docs? Can anyone imagine going back to a native office suite for any kind of collaboration?
  • Are slippy maps okay, or should we go back to the Mapquest experience of clicking a little arrow and waiting for the page to reload in order to see a little more to the east?
  • Do you want responsive charts in your news articles, so that they're legible on any device? Think of all the COVID explainers and election results from the past year — should all those have been rejected for being "too app-like?"
  • Should a person be able to check their e-mail from any computer, or should they have to install a dedicated native client and remember all their server details?
  • Think about all the infrequent tasks you do online, and now imagine that they're all either regressed to the 1998 version or built as native code. Do you think you should have to install an app just to book a flight? To buy a book? To find a new job?

(Incidentally, it's wild how much the mobile market has been distorted on these issues: I think most people would consider it a total non-starter to need to install a desktop app to read Facebook or stream a TV show, but Apple has worked very hard to protect their platform from browser-based options on mobile.)

I think it's possible for someone to look at that list and still insist that yes, they want browsers to be Gopher clients with slightly better font choices. I personally doubt it, though — I suspect most people making the case for a "document-first" web aren't irrational, they just like the romance of the idea and haven't fully thought it through. I sympathize! That doesn't mean we have to take them seriously.

June 18, 2021

Filed under: tech»coding

Upstream

In 2013, Google decided to shut down Google Reader, one of a number of boneheaded decisions that the company undertook in pursuit of some bizarre competition with Facebook. At the time, I decided to try an experiment: I'd write my own RSS reader, try it for a few months, and if it didn't work out I'd switch to one of the corporate replacement options.

Eight years later, I still use Weir to keep up with various feeds, blogs, and news updates. It's a deeply personal piece of software — the project that made me fall in love with the idea of code tools that are crafted just for a single person, like making your own workbench or sewing your own clothes.

But I also haven't substantially upgraded or altered Weir in all that time, even as I've learned a lot about developing on the web. So this week, while I had the apartment to myself, I decided to experiment again and build a new client (while mostly leaving the server alone). After I get a chance to work out any remaining kinks, I'll move it over to become the new built-in UI for the application.

I love my curvy UI

The original client was written in Angular 1 as a learning project. It's fine! It's mostly fine. The main problem that Angular had — and which other front-end frameworks have inherited — is that it wanted you to do all your work at a level of abstraction from the DOM, and any problems that couldn't be cleanly moved into the state object would get messy. Browsers were also worse in those days: no intersection observers for handling scroll positioning, inconsistent event handling, no support for easy concurrency with async/await. So there's some awkward behavior in the original client that never felt like it would be easy to fix, because it required crossing that abstraction barrier.

Unsurprisingly, for the rewrite I organized the code via web components — extended from the same base class that I used for Radio, and coordinated over a central event bus similar to the command system in Caret. The only code that translated over mostly unchanged was the sanitization module, which loads each post body into an inert document and processes it to remove ads, custom styles, class names, and anything else that isn't plain HTML content.

What is surprising is that the two codebases are not notably different in size — in fact, CLOC gives roughly the same line counts between the two. Of course, that only includes code I wrote. The original Weir client also requires 80KB of Angular runtime code, which has to be downloaded, parsed, compiled, and run before any of my code shows up onscreen. I'm using those precious first-paint seconds to indulge in a build-free workflow — all JS is just loaded as raw ES modules, and components fetch their styles and markup from individual HTML files instead of using Less and Browserify. It all evens out, but if I decide I'm tired of paying a startup penalty, it's certainly easy enough to add Rollup to the process.

Typically when I go framework-less, the thing I miss most is iteration in templates. It's still a little clumsy in the new client code. But combining Element.replaceChildren(), shadow DOM slots, and elements that act as template partials, it's honestly much less of an issue these days. I could add a databinding function to diff and transition elements, as Radio does for its sorted podcast lists, but (other than the feed management table) there's almost no part of the UI here where view data persists between state transitions, so it's not really worth the effort.

Scrolls like the Dead Sea

Instead of using a stack of full-window UI "scenes" for different tasks within the UI (such as settings, feed management, and reading stories), the new client is organized in three columns (admin, story listings, and reader). On desktop, they line up side-by-side across the window, and on mobile each one takes up the whole screen, similar to something like Tweetdeck or Mastodon. CSS scroll snap makes it easy to swipe between them horizontally or scroll vertically within the individual panels as their content requires. In practice this gives us a native-feeling, responsive UI pattern with no JavaScript, and it will feel more natural when scroll-snap-stop is supported to prevent overscroll.

Figure: Weir on desktop, with three columns in a row.

Figure: Weir on a phone, with swipeable columns (artist's rendering).

Unfortunately, creating a mobile UI that scrolls in two directions like this means that viewport management is more difficult to handle programmatically. For example, when loading a story into the reader panel, we want to scroll smoothly over to that column from the story list, while immediately jumping within the reader content to the top of the story. In contrast, the story list should scroll smoothly both for its contents (when you use the keyboard shortcuts to select the next item in the list) and when it becomes the primary view on mobile (say, if you reach the end of all unread stories).

Ultimately, the solution was to split scrolling into separate code paths, depending on whether we want to move between columns, or within them. The code still uses scrollIntoView for panel transitions, and modules send a request over the global event bus if they want a different view to take over. The panels themselves are shell custom elements that offer individual control for scrolling content separately from the main viewport — the reader and story list dispatch DOM events up the tree to the ancestor panel when they need their column to scroll vertically to a certain element or offset, with or without an animated transition.
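To make the split concrete, here is a sketch of the two code paths. It is an assumption-heavy illustration rather than the real client: the `scroll-panel` element, the `panel-scroll` event, and the helper names are all invented, and it assumes each panel is itself the vertically scrolling box.

```javascript
// Turn a scroll request into options for Element.scrollTo().
// "instant" jumps immediately; "smooth" animates.
function scrollOptions({ top = 0, smooth = false }) {
  return { top, behavior: smooth ? "smooth" : "instant" };
}

// Path 1: moving between columns. Scroll the main viewport sideways.
function showPanel(panel) {
  panel.scrollIntoView({ behavior: "smooth", inline: "start", block: "nearest" });
}

// Path 2: moving within a column. A child asks its ancestor panel to scroll.
function requestPanelScroll(element, detail) {
  element.dispatchEvent(
    new CustomEvent("panel-scroll", { bubbles: true, composed: true, detail })
  );
}

// Guard so this file also loads outside a browser (for example in Node).
if (typeof customElements !== "undefined") {
  class ScrollPanel extends HTMLElement {
    constructor() {
      super();
      // The panel handles requests from its descendants and stops them there.
      this.addEventListener("panel-scroll", (e) => {
        e.stopPropagation();
        this.scrollTo(scrollOptions(e.detail));
      });
    }
  }
  customElements.define("scroll-panel", ScrollPanel);
}
```

Under these assumptions, loading a story would call `showPanel(readerPanel)` for the animated column change and `requestPanelScroll(storyBody, { top: 0 })` for the immediate jump to the top of the content.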

Promises, promises

At the start of the process, I didn't intend to do anything to the server side of Weir. It had already been built to handle cross-domain requests, so I didn't need to change anything for local development, and while it has its quirks, I'm generally pretty happy with how it works. Then I hit a snag: the "mark all as read" API route returns a count of stories that were updated, but not the new unread/total story counts. It was just irritating enough that I decided to dig in and make one little change. Of course it snowballed from there.

Since it was as much a learning experience as it was a legit project, Weir doesn't use a typical Node library for setting up its API. I wrote my own request handler and router on top of the basic HTTP module. That part of the code has actually aged pretty well. However, to manage the async chains involved in making database calls and RSS fetches, I wrote a utility library called Manos (because you're putting your code in the hands of fate), and that stuff was a mess.

These days, the ideal way to handle async flow is with the await keyword, so you don't have to write code out of order or in a snarl of function wrappers. But using await requires functions to return promises instead of accepting callbacks, and all of my code was written before JavaScript promises were standardized. So to make it a little easier to insert a db.getStatus() call in a single handler, I ended up converting the whole application to a promise-based flow.
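The conversion itself is mostly mechanical. The sketch below shows the general shape, with a made-up callback-style database object standing in for the old code; `legacyDb`, `promisify`, the `markAllAsRead` handler, and the numbers are placeholders, and only the `db.getStatus()` name comes from the real application.

```javascript
// Hypothetical callback-style database layer, standing in for pre-promise code.
const legacyDb = {
  getStatus(callback) {
    // Node convention: callback(error, result)
    setTimeout(() => callback(null, { unread: 12, total: 40 }), 0);
  }
};

// Wrap a Node-style callback function so that it returns a promise instead.
function promisify(fn, context) {
  return (...args) =>
    new Promise((resolve, reject) => {
      fn.call(context, ...args, (err, result) => {
        if (err) reject(err);
        else resolve(result);
      });
    });
}

const db = { getStatus: promisify(legacyDb.getStatus, legacyDb) };

// Hypothetical route handler: mark everything as read, then report fresh counts.
// `markAll` is any async function that resolves to the number of updated stories.
async function markAllAsRead(markAll) {
  const updated = await markAll();
  const { unread, total } = await db.getStatus();
  return { updated, unread, total };
}
```

Once every layer returns promises, adding the extra status lookup is a single `await` line in the handler instead of another nested callback.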

Luckily, I went through a similar process a few years back with Caret, when async/await shipped in Chrome, so I largely knew what to expect. Surprisingly, the biggest change is not in the routes at all, but in the "Hound" component that periodically fetches feed items from various URLs: subscriptions have to be grouped into batches, then each batch is requested, sometimes decompressed from gzip, fed to a streaming parser, and finally saved to the database. As implemented with Manos, the code was at best out of order, and at worst involved a lot of "clever" functional tricks.

The new Hound flow has its issues — I think there's some leftover weirdness from the way old-school Node streams interact with each other that requires pausing the request as soon as it comes in — but it now reads top to bottom, and most of the complication comes from the problem domain and not the language. At some point, updating the request code to use something like fetch() will probably eliminate most of the remaining issues.
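For a sense of what "top to bottom" means here, this is a simplified sketch of a batched fetch loop. It is not the real Hound code: `fetchAllFeeds`, `toBatches`, and the injected `fetchFeed` and `saveItems` functions are hypothetical, and the decompression and streaming parser steps are folded into `fetchFeed`.

```javascript
// Split a list into consecutive groups of at most `size` items.
function toBatches(list, size) {
  const batches = [];
  for (let i = 0; i < list.length; i += size) {
    batches.push(list.slice(i, i + size));
  }
  return batches;
}

// Fetch every subscription, one batch at a time.
// `fetchFeed(sub)` resolves to parsed items; `saveItems(sub, items)` stores them.
async function fetchAllFeeds(subscriptions, { batchSize, fetchFeed, saveItems }) {
  const results = [];
  for (const batch of toBatches(subscriptions, batchSize)) {
    // Feeds within a batch run concurrently; batches run one after another.
    const settled = await Promise.allSettled(
      batch.map(async (sub) => {
        const items = await fetchFeed(sub);
        await saveItems(sub, items);
        return items.length;
      })
    );
    // One failing feed should not abort the rest of the batch.
    settled.forEach((outcome, i) => {
      const ok = outcome.status === "fulfilled";
      results.push({ url: batch[i].url, ok, count: ok ? outcome.value : 0 });
    });
  }
  return results;
}
```

Each step appears in the order it runs, and a failed request only affects the entry for its own feed.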

Second-system syndrome

There's a truism in development circles that a rewrite is often a debacle: people point to the rewrite of Netscape 4.0 that's blamed for tanking the company, or the Copland OS at Apple. My personal suspicion is that this is survivorship bias: Netscape itself was originally a from-scratch rewrite of the Mosaic browser, and while Copland was not a success, current Mac OS is built on the bones of NeXT, itself a from-scratch OS.

In any case, most people aren't building browsers or operating systems. For these kinds of small projects, I think there's value in taking another run at an idea, armed with knowledge about what worked or didn't work the first time around. In fact, that might be the best argument for these kinds of small projects (API clients, media players, browser extensions): they're a chance to stop, try something different, and measure our skills against our past selves. I learn a lot from these little rewrites, and I think it's safe to say that I am better at this than I was eight years ago.

I still wish they'd just bring back Reader, though.

February 15, 2021

Filed under: tech»web

Between Amber and Chaos

There isn't, in my opinion, a cooler name for a web standard than the Shadow DOM. The closest runner-up is probably the SubtleCrypto API, and after a decade of Bitcoin the appeal of anything with "crypto" in the name is pretty cloudy. So it's a low bar, but still: Shadow DOM. Pretty cool name.

Although I've been using web components for a long time, I've only been using Shadow DOM with them for a couple of years, and generally in pretty limited ways. For an upcoming project at NPR, I took the chance to really dig into how it's used in a mixed-content environment, one where custom elements are not just leaves of the HTML tree, but also wrap branches of extensive HTML content. The experience was pretty eye-opening, and surprisingly positive!

Walking the pattern

Let's start by talking about what it is. Like most of the tech under the web components "brand," Shadow DOM is meant to retroactively give developers tools that "explain" what the browser already does, and hook into the same extension points. The goal is to make it possible for regular people to rapidly build out new functionality, because there's no "magic" behind the scenes.

For example, let's create a humble <select> tag:

Right off the bat, this tag has some special treatment that we can't immediately explain through regular HTML: it has a "thumb" (the arrow on the right) that doesn't appear in the DOM and can't be meaningfully styled, but is clearly a UI element that reacts to events. The options, defined as children of the tag, are still surfaced visibly, but not in the same way that children of a paragraph or another regular text element are. Instead, they're moved to a new location in the dropdown menu and shown conditionally (or, on mobile, through an entirely different UI context).

Using our previous HTML/JS toolkit, it's not possible to duplicate these behaviors, or similar behaviors from tags like <video> or <input type="range">. To explain the "magic" of these elements we need to add Shadow DOM. It gives developers an API to attach a hidden document fragment called a "shadow root" to any given element, which replaces the visible contents of the element. However, even though they're shown to us in the browser, the contents of that document fragment are hidden from normal JavaScript queries, and its CSS styles are isolated — from the inside, you have a blank slate to work from, and from the outside it's as though that shadow content is an intrinsic part of the tag itself, just like the select box's dropdown UI.

What about those select box options, which are written as child tags but appear in a very different way? For that, we add in a <slot> element: inside the shadow, this element will re-parent any children placed in the host element. For example, given a shadow-dom element with the following in its shadow root: <b> SHADOW START </b> <slot></slot> <b> SHADOW END </b>

We could write this in our page as: <shadow-dom> <i>HELLO WORLD</i> </shadow-dom>

The contents of the <i> element aren't shown directly. Instead, they're moved inside the slot element, meaning that the page output will read SHADOW START HELLO WORLD SHADOW END. But, and this is the cool part, that italic tag appears to scripts and dev tools as though it was just a regular child of the <shadow-dom> element — it can be styled as normal, you can query for it, attach event listeners, and edit it as normal. The bold tags, meanwhile, remain in the shadow: they're visible on the page, but they can't be accessed from scripts and their styles are completely isolated.
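For reference, a minimal definition of that `<shadow-dom>` element could look something like the sketch below. The class name and the `shadowMarkup` constant are my own placeholders, and the `typeof` guard exists only so the file also loads outside a browser.

```javascript
// The markup that lives in the shadow root; <slot> marks where children appear.
const shadowMarkup = "<b> SHADOW START </b> <slot></slot> <b> SHADOW END </b>";

// Guard so this file also loads outside a browser (for example in Node).
if (typeof customElements !== "undefined") {
  class ShadowDomElement extends HTMLElement {
    constructor() {
      super();
      // Attaching and filling the shadow root is allowed during construction.
      const root = this.attachShadow({ mode: "open" });
      root.innerHTML = shadowMarkup;
    }
  }
  // Registers the tag, so <shadow-dom> in the page picks up this behavior.
  customElements.define("shadow-dom", ShadowDomElement);
}
```

With this in place, `document.querySelector("shadow-dom i")` finds the italic tag as usual, while `document.querySelector("shadow-dom b")` returns null because page-level queries don't reach into the shadow root. In "open" mode a script can still get in deliberately through the element's `shadowRoot` property.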

This, then, is how Shadow DOM "explains" how a select box works. The box itself, including the current item and the thumb UI, lives in the shadow. The options you write into the tag are reparented to a slot inside the drop-down area, to be shown when you click the element. We can use this API to create self-contained UI for an application or document without having to worry about new markup or styles polluting the page.

Enter the Logrus

Not everything is rosy, of course. One long-standing complication is that custom elements can't touch their own contents or attributes during construction, for reasons that are tedious and not worth going into here, but they can attach and modify their shadow root. So it's really tempting in custom elements to do everything in a shadow, because it radically simplifies your templating. Now you have null problems. In Radio, I built the entire UI this way, which worked great until I needed to inspect an element inside three nested shadow roots, or query for the current active element.

Another misunderstanding has been people thinking shadow roots can replace something like Styled Components in terms of style isolation. But Shadow DOM is more like an iframe than anything else: explicitly inherited style properties (like font family) will travel through, but otherwise it's a pretty hard barrier. If you want to provide styling hooks for a component, you need to either provide preset options or document a set of CSS custom properties. More importantly, the mechanisms for injecting styles into a shadow root (typically by putting a <style> tag inside) don't play well with standard build tooling.

By contrast, actually populating Shadow DOM tends to be cumbersome without build tooling in place to help. A lot of tutorials recommend building it from an inert <template> tag, which used to be elegantly handled via HTML imports. Now that those are deprecated, you either have to place the Shadow DOM template in your page manually (no), lean into async component definition (awkward), embed the markup into your script as a big JS literal (ugly), or use a build plugin to pull strings in as needed (sigh). None of these are unworkable, or even that difficult, but none of them are nearly as nice as simply being able to define a component's styles, shadow markup, and behavior in a single, imported HTML file.

Major Arcana

My personal feeling is that the biggest barrier to effective Shadow DOM usage, in a lot of cases, is that many developers haven't learned about the browser as much as they've learned about React or another framework, and those frameworks have often diverged in philosophy from the DOM. If you're used to thinking of the page as a JSX function value, the idea of a secret, stateful document fragment that replaces the DOM you tried to render is probably pretty bizarre.

But as someone who writes a lot of minimalist code directly against browser APIs, I actually think Shadow DOM fits in well with my mental model of how elements work, and it has clarified a lot of my thinking on how to build effectively with custom elements — especially through slots and slotted elements.

I'm still learning and experimenting, but I feel comfortable saying that if you're building custom elements, the rule of thumb should be "use Shadow DOM, but not very much." The more you're able to expose HTML to the light DOM by surfacing it through slots, the easier it is to compose them and style content. For example, a custom element that creates a tabbed UI from its children is a great Shadow DOM use case: the tab list lives in the shadow and is generated implicitly by iterating over the slotted elements. Since the actual tab contents are placed back in the light DOM, they're still easy to style and inspect. To really go with the grain of the platform, the host component might show or hide those slotted blocks using the hidden attribute, instead of setting styles or adding classes.
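Here is a sketch of that tabbed element, to show how little needs to live in the shadow. Treat it as an illustration under assumptions, not a finished component: the `tab-group` tag, the `data-label` attribute, and the `hiddenStates` helper are invented for the example, and it skips the ARIA roles and keyboard handling a production version would need.

```javascript
// Decide which panels are hidden: every panel except the selected one.
function hiddenStates(count, selected) {
  return Array.from({ length: count }, (_, i) => i !== selected);
}

// Guard so this file also loads outside a browser (for example in Node).
if (typeof customElements !== "undefined") {
  class TabGroup extends HTMLElement {
    constructor() {
      super();
      const root = this.attachShadow({ mode: "open" });
      // Only the tab list lives in the shadow; panels stay in the light DOM.
      root.innerHTML = "<nav></nav><slot></slot>";
      this.nav = root.querySelector("nav");
      this.slotElement = root.querySelector("slot");
      this.selected = 0;
      // Re-render whenever the set of slotted children changes.
      this.slotElement.addEventListener("slotchange", () => this.render());
    }

    render() {
      const panels = this.slotElement.assignedElements();
      const hidden = hiddenStates(panels.length, this.selected);
      // Build one button per slotted panel.
      const buttons = panels.map((panel, i) => {
        const button = document.createElement("button");
        button.textContent = panel.dataset.label || `Tab ${i + 1}`;
        button.addEventListener("click", () => {
          this.selected = i;
          this.render();
        });
        return button;
      });
      this.nav.replaceChildren(...buttons);
      // Show or hide panels with the hidden attribute, not styles or classes.
      panels.forEach((panel, i) => panel.toggleAttribute("hidden", hidden[i]));
    }
  }
  customElements.define("tab-group", TabGroup);
}
```

In a page, this would wrap ordinary markup such as `<tab-group><section data-label="One">...</section><section data-label="Two">...</section></tab-group>`, and each section stays styleable and inspectable from the outside.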

The exception is for elements that should not have children (like input tags) or where children are used for configuration — think video tags or my old Leaflet map component. With these "leaf" components, Shadow DOM lets you treat inner HTML as a domain-specific language, while your visible content lives entirely in the shadow root. That's a great way to create customized behavior, but expose it to designers or novice front-end developers who are very comfortable with markup but would balk at writing a lot of JS.

Ultimately, Shadow DOM feels like it really crystallizes the role of custom elements as a tool for implementing UI widgets, not as a competitor for Svelte. Indeed, by providing a mechanism for moving complex functionality into an opaque facade, it's probably the biggest gift to the "web pages are for documents, not apps" crowd in several years: if you want to build a big single page app, Shadow DOM doesn't really move the needle, but it's great for injecting discrete units of content into an article. As someone who crosses that app/document divide a lot, I'm really excited to see what I can do with it this year.
