
March 31, 2022

Filed under: tech»open_source

CTRL alt QMK

Like a lot of people during the pandemic, early last year I got into mechanical keyboard collecting. Once you start, it's an easy hobby to sink a lot of time and money into, but the saving grace is that it's also ridiculously inconvenient even before the supply chain imploded, since everything is a "group buy" or some other micro-production release, so it tends to be fairly self-limiting.

I started off with a Drop CTRL, which is a pretty basic mechanical that serves as a good starting point. Then I picked up a Keychron Q1, a really sharp budget board that convinced me I need more keys than a 75% layout, and finally a NovelKeys NK87 with Box Jade clicky switches, which is just a lovely piece of hardware and what I'm using to type this.

All three of these keyboards are (very intentionally) compatible with the open-source QMK firmware. QMK is very cool, and ideally it means that any of these keyboards can be extended, customized, and updated in any way I want. For example, I have a toggle set up on each board that turns the middle of the layout into a number pad, for easier spreadsheet edits and 2FA inputs. That's the easy mode — if you really want to dig in and write some C, these keyboards run on ARM chips somewhere on the order of a Nintendo DS, so the sky's pretty much the limit.

That said, "compatible" is a broad term. Both the Q1 and NK87 have full QMK implementations, including support for VIA for live key-remapping and macros, but the CTRL (while technically built on QMK) is usually configured via a web service. It's mostly reliable, but there have been a few times in the last few months where the firmware I got back after remapping keys was buggy or unreliable, and this week I decided I wanted to skip the middleman and get QMK building for the CTRL, including custom lighting.

Well, it could have been easier, that's for sure. In getting the firmware working the way I wanted it, I ended up having to trawl through a bunch of source code and blog posts that always seemed to be missing something I needed. So I decided I'd write up the process I took, before I forget how it went, in case I need it in the future or someone else finds it helpful.

Building firmware

The QMK setup process is reasonably well documented — it's a Python package, mostly, wrapped around a compilation toolchain. It'll clone the repo for you and install a qmk command that manages the process. I set mine up on WSL and was up and running pretty quickly.

Once you have the basics going, you need to create a "keymap" variation for your board. In my case, I created a new folder at qmk_firmware/keyboards/massdrop/ctrl/keymaps/thomaswilburn. There are already a bunch of keymaps in there, which is one of the things that gives QMK a kind of ramshackle feel, since they're just additions by randos who had a layout that they like and now everyone gets a copy. Poking around these can be helpful, but they're often either baroque or hyperspecialized (one of them enables the ability to programmatically trigger individual lights from terminal scripts, for example).

However, the neat thing about QMK's setup is that the files in each keymap directory are loaded as "overrides" for the main code. That means you only need to add the files that change for your particular use, and in most cases that means you only need keymap.c and maybe rules.mk. In my case, I copied the default_md folder as the starting place for my setup, which only contains those files. Once that's done, you should be able to test that it builds by running qmk compile -kb massdrop/ctrl -km thomaswilburn (or whatever your folder was named).

Once you have a firmware file, you can send it to the keyboard by using the reset button on the bottom of the board and running Drop's mdloader utility.

Remapping

QMK is designed around the concept of layers, which are arrays of layout config stacked on top of each other. If you're on layer #3 and you press X, the firmware checks its config to see if there's a defined code it should send for that physical key on that layer. QMK can also have a slot defined as "transparent," which means that if there's not a code assigned on the current layer, it will check the next one down, until it runs out. So, for example, my "number pad" layer defines U as 4, I as 5, and so on, but most of the keys are transparent, so pressing Home or End will fall through and do the right thing, which saves time having to duplicate all the basic keys across layers.
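As a toy model of that resolution logic (this is illustrative JavaScript, not QMK's actual C implementation, and the helper names are my own):

```javascript
// Layers are checked from highest priority down; a transparent slot
// defers the lookup to the next active layer in the stack.
const TRANSPARENT = "_______";

function resolveKeycode(layers, activeLayers, key) {
  // activeLayers is ordered highest-priority first
  for (const index of activeLayers) {
    const code = layers[index][key];
    if (code && code !== TRANSPARENT) return code;
  }
  return null; // no active layer defines this key
}

const layers = [
  { KC_U: "KC_U", KC_HOME: "KC_HOME" }, // layer 0: base layout
  {},                                    // layer 1: unused here
  { KC_U: "KC_P4" },                     // layer 2: number pad, U becomes numpad 4
];

resolveKeycode(layers, [2, 0], "KC_U");    // "KC_P4": the numpad layer wins
resolveKeycode(layers, [2, 0], "KC_HOME"); // "KC_HOME": falls through to the base layer
```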

If your board supports VIA, remapping the layer assignments is easy to do in software, and your keymap file will just contain mostly empty layers. But since the CTRL doesn't support VIA, you have to assign them manually in C code. Luckily, the default keymap has the basics all set up, as well as a template for an all-transparent layer that you can just copy and paste to add new ones. You can see my layer assignments here. The _______ spaces are transparent, and XXXXXXX means "do nothing."

There's a full list of keycodes in the QMK docs, including a list of their OS compatibility (MacOS, for example, has a weird relationship with things like "number lock"). Particularly interesting to me are some of the combos, such as LT(3, KC_CAPS), which means "switch to layer three if held, but toggle caps lock if tapped." I'm not big on baroque chord combinations, but you can make the extended functions a lot more convenient by taking advantage of these special layer behaviors.

Ultimately, my layers are pretty straightforward: layer 0 is the standard keyboard functions. Layer 1 is fully transparent, and is just used to easily toggle the lighting effects off and on. Layer 2 is number pad mode, and Layer 3 triggers special keyboard behaviors, like changing the animation pattern or putting it into "firmware flash" mode.

Lighting

Getting the firmware compiling was pretty easy, but for some reason I could not get the LED lighting configuration to work. It turns out that there was a pretty silly answer for this. We'll come back to it. First, we should talk about how lights are set up on the CTRL.

There are 119 LEDs on the CTRL board: 87 for the keys, and then 32 in a ring around the edges to provide underglow. These are addressed in the QMK keymap file using a legacy system that newer keyboards eschew, I think because it was easier for Drop to build their web config tool around the older syntax. I like the new setup, which lets you explicitly specify ranges in a human-readable way, but the Drop method isn't that much more difficult.

Essentially, the keymap file should set up an array called led_instructions filled with C structs configuring the LED system, which you can see in my file here. If you don't write a lot of C, the notation for the array may be unfamiliar, but these unordered structs aren't too different from, say, JavaScript objects, except that the property names have to start with a dot. Each one gets evaluated in turn for each LED, and a set of flags tells QMK what conditions it requires to activate and what it does. These flags are:

  • LED_FLAG_USE_PATTERN - indicates that you're going to set a specific pattern by index from the set of different animations that the CTRL ships by default. For example, .pattern = 3 should activate the teal/salmon gradient.
  • LED_FLAG_USE_ROTATE_PATTERN - indicates that you want to use the user-selectable pattern, which the user can switch between using hotkeys.
  • LED_FLAG_USE_RGB - indicates that instead of using a preset color or pattern, you'll provide custom RGB values for the LEDs.
  • LED_FLAG_MATCH_LAYER - will only apply this lighting when the current layer matches the provided index.
  • LED_FLAG_MATCH_ID - will only apply this lighting to LEDs matching an ID bitmask.
Combining these gives you a lot of flexibility. For example, let's say I want to light up the keys in the "number pad" (7-9, U-O, J-L, and M-period) in bright green when layer #2 is active. For that case, the struct looks something like this:
{
  .flags = LED_FLAG_MATCH_LAYER | 
    LED_FLAG_USE_RGB | 
    LED_FLAG_MATCH_ID,
  .g = 255, 
  .id0 = 0x03800000,
  .id1 = 0x0E000700,
  .id2 = 0xFF8001C0,
  .id3 = 0x00FFFFFF,
  .layer = 2
},
The flags mean that this will only apply when the active layer matches the .layer property, we're going to provide color byte values (just .g in this case, since the red and blue values are both zero), and only LEDs matching the bitmask in .id0 through .id3 will be affected.

Most of this is human-readable, but those IDs are a pain. They are effectively a bitmask of four 32-bit integers, where each bit corresponds to an LED on the board, starting from the escape key (id 0) and moving left-to-right through each row until you get to the right arrow in the bottom-right of the keyboard (id 86), and then proceeding clockwise all around the edge of the keyboard. So for example, to turn on the leftmost keys of the keyboard, you'd take their IDs (0 for escape, 16 for `, 35 for tab, 50 for capslock, 63 for left shift, and 76 for left control), divide each by 32 to find out which .idX value you want, and then take it modulo 32 to find the correct bit to set within that integer (in this case, the result is 0x00010001 0x80040008 0x00001000). That's not fun!
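The arithmetic itself is mechanical enough to script. Here's a minimal sketch (in JavaScript, since that's what my tooling is written in) of how a list of LED IDs becomes the four .idX integers; the IDs themselves still have to come from the board's LED table:

```javascript
// Convert a list of LED IDs into the four 32-bit mask values that the
// CTRL firmware expects in .id0 through .id3.
function ledMask(ids) {
  const words = [0, 0, 0, 0];
  for (const id of ids) {
    const word = Math.floor(id / 32); // which .idX field this LED lands in
    const bit = id % 32;              // which bit within that integer
    words[word] |= 1 << bit;
  }
  // ">>> 0" coerces JavaScript's signed 32-bit result back to unsigned for display
  return words.map(w => "0x" + (w >>> 0).toString(16).padStart(8, "0").toUpperCase());
}

// The leftmost column of keys from the example above:
ledMask([0, 16, 35, 50, 63, 76]);
```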

Other people who have done this have used a Python script that requires you to manually input the LED numbers, but I'm a web developer. So I wrote a quick GUI for creating the IDs for a given lighting pattern: click to toggle a key, and when the diagram is focused you can also press physical keys on your keyboard to quickly flip them off and on. The input contains the four ID integers that the CTRL expects when using the LED_FLAG_MATCH_ID option.

Using this utility script, it was easy to set up a few LED zones in a Vilebloom theme that, for me, evokes the classic PDP-11 console. But as I mentioned before, when I first started configuring the LED system, I couldn't get anything to show up. Everything compiled and loaded, and layers worked, but no lights appeared.

What I eventually realized, to my chagrin, was that the brightness was turned all the way down. Self-compiled QMK tries to load settings from persistent memory, including the active LED pattern and brightness, but I suspect the Drop firmware doesn't save them, so those addresses were zero. After I used the function keys to increase the backlight intensity, everything worked great.

In review

As a starter kit, the CTRL is pretty good. It's light but solidly constructed with an aluminum case, relatively inexpensive, and it has a second USB-C port if you want to daisy-chain something else off it. It's a good option if you want to play around with some different switch options (I added Halo Clears, which are pingy but have the same satisfying snap as that one Nokia phone from The Matrix).

It's also weirdly power-hungry, the integrated plate means it's stiff and hard to dampen acoustically, it only takes 3-prong switches, and Drop's software engineering seems to be stretched a little thin. So it's definitely a keyboard that you can grow beyond. But I'm glad I put the time into getting the actual open source firmware working — at the very least, it can be a fun board for experimenting with layouts and effects. And if you're hoping to stretch it a little further than its budget roots, I hope the above information is useful.

March 7, 2020

Filed under: tech»open_source

Call Me Al

With Super Tuesday wrapped up, I feel pretty confident in writing about Betty, the new ArchieML parser that powered NPR's new election liveblogs. Language parsers are a pretty fundamental computer science discipline, which of course means that I never formally learned about them. Betty isn't a very advanced parser, compared to something that can handle a real programming language, but it's still pretty neat — and you can't say it's not battle-tested, given the tens of thousands of concurrent readers who unknowingly consumed its output last week.

ArchieML is a markup language created at the New York Times a few years back. It's designed to be easy to learn, error-tolerant, and well-suited to simultaneous editing in Google Docs. I've used it for several bigger story projects, and like it well enough. There are some genuinely smart features in there, and on a slower development cycle, it's easy enough to hand-fix any document bugs that come up.

Unfortunately, in the context of the NPR liveblog system, which deploys updated content on a constant loop, the original ArchieML had some weaknesses that weren't immediately obvious. For example, its system for marking up multi-line strings — signalling them with an :end token — proved fragile in the face of reporters and editors who were typing as fast as they could into a shared document. ArchieML's key-value syntax is identical to common journalistic structures like Sanders: 1,000, which would accidentally turn what the reporter thought was an itemized list into unexpected new data fields and an empty post body. I was spending a lot of time writing document pre-processors using regular expressions to try to catch errors at the input level, instead of processing them at the data level, where it would make sense.
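For example, imagine a reporter typing tallies into a post body wrapped in the original multi-line syntax:

```
body: Here's where the race stands tonight:
Sanders: 1,000
Biden: 800
:end
```

Roughly speaking, because lines that look like keys interrupt a multi-line value, Sanders and Biden become top-level data fields and the post body ends early, which is exactly the failure mode that kept biting us.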

To fix these errors, I wanted to introduce a more explicit multi-line string syntax, as well as offer hooks for input validation and transformation (for example, a way to convert the default string values into native types during parsing). My original impulse was to patch the module offered by the Times to add these features, but it turned out to be more difficult than I'd thought:

  • The parser used many, many regular expressions to process individual lines, which made it more difficult to "read" what it was doing at any given time, or to add to the parsing in an extensible way.
  • It also used a lazy buffering strategy in a single pass, in which different values were added to the output only when the next key was encountered in the document, which made for short but extremely dense code.
  • Finally, it relied on a lot of global scope and nested conditionals in a way that seemed dangerous. In the worst case you'd end up with a condition like !isSkipping && arrayElement.exec(input) && stackScope && stackScope.array && (stackScope.arrayType !== 'complex' && stackScope.arrayType !== 'freeform') && stackScope.flags.indexOf('+') < 0, which I did not particularly want to untangle.

Okay, I thought, how hard can it be to write my own parser? I was a fool. Four days later, I emerged from a trance state with Betty, which manages to pass all the original tests in the repo as well as some of my own for my new syntax. I'm also much more confident in our ability to maintain and patch Betty over time (the ArchieML module on NPM hasn't been updated since 2016).

Betty (who Wikipedia tells me was the mechanic in the comics, appropriately enough) is about twice as large as the original parser was. That size comes from the additional structure in its design: instead of a single pass through the text, Betty builds the final output from three escalating passes.

  1. The tokenizer runs through the document character by character, and outputs a stream of tokens containing one or more characters of text bucketed into different types.
  2. The parser takes that stream of tokens, reassembles them into lines (as is required by the ArchieML spec), and then matches those against syntax patterns to emit a list of assembly instructions (such as setting a key, creating an array, or buffering text).
  3. Finally, the assembler runs the instructions through a final cleanup (consolidating and trimming values) and then uses them to build the output object.
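To make the first stage concrete, here's a drastically simplified tokenizer in the same spirit (a toy sketch, not Betty's actual code, and the token types are invented for this example):

```javascript
// Stage 1: walk the document character by character, bucketing runs of
// characters into typed tokens that later stages can reassemble into lines.
function tokenize(input) {
  const tokens = [];
  let buffer = "";
  const flush = () => {
    if (buffer) tokens.push({ type: "TEXT", value: buffer });
    buffer = "";
  };
  for (const char of input) {
    if (char === ":") {
      flush();
      tokens.push({ type: "COLON", value: char });
    } else if (char === "\n") {
      flush();
      tokens.push({ type: "NEWLINE", value: char });
    } else {
      buffer += char; // accumulate plain text until a special character
    }
  }
  flush();
  return tokens;
}

tokenize("key: value");
// [ { type: "TEXT", value: "key" },
//   { type: "COLON", value: ":" },
//   { type: "TEXT", value: " value" } ]
```

A real tokenizer has more token types (brackets, asterisks, and so on), but the shape is the same: the parser downstream never has to look at raw characters again.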

Essentially, Betty trades concision for clarity: during debugging, it was handy to be able to look at the intermediate outputs of each stage to see where something went wrong. Each pipeline section is also much more readable, since it only needs to be concerned with one stage of the process, so it uses less global state and does less bookkeeping. The parser, for example, doesn't need to worry about the current object scope or array types, but can simply defer those to the assembler.

But the real win is the ease of adding new syntax to ArchieML, which the original parser was never designed to support. Our new multi-line type means that editors and reporters can write plain English in posts and not have to worry about colliding with the document syntax in unexpected ways. Switching to Betty cleaned up the liveblog code substantially, since we can also take advantage of the assembler's pipeline hooks: keys are automatically camel-cased (Google Docs likes to sentence-case keys if you're not careful), and values can be converted automatically to JavaScript Date objects or numbers at the earliest stage, rather than during output or templating.
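As a sketch of what those hooks do (the function names here are hypothetical, not Betty's real API), the key and value transforms amount to something like this:

```javascript
// Normalize a sentence-cased key from Google Docs ("Poll closing")
// into the camelCase form templates expect ("pollClosing").
function camelCase(key) {
  return key
    .toLowerCase()
    .replace(/[\s_-]+(\w)/g, (_, c) => c.toUpperCase());
}

// Coerce numeric-looking string values into actual numbers; anything
// else just gets trimmed. (Date coercion would follow the same pattern.)
function coerceValue(value) {
  const trimmed = value.trim();
  if (/^-?\d+(\.\d+)?$/.test(trimmed)) return Number(trimmed);
  return trimmed;
}

camelCase("Poll closing"); // "pollClosing"
coerceValue("1000");       // 1000, as a number instead of a string
```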

If you'd told me a few years ago that I'd be writing something this complicated, I would have been extremely surprised. I don't have a formal CS background, nor did I ever want one. Parsing is often seen as black magic by self-taught developers. But as I have argued in the past, being able to write even simple parsers is an incredibly valuable skill for data journalism, where odd or proprietary data formats are not uncommon. I hope Betty will not just be a useful library for my work at NPR, but also a valuable teaching tool in the community.

November 20, 2019

Filed under: tech»open_source

Repackaged apps

Earlier this week, a member of the Google developer relations team ported Caret to the web. He's actually the second person from Chrome to do this — a member of the browser team created a separate port last month. The reasons for this are simple: Caret is a complete application with a relatively small API surface, most of which revolves around file I/O. Chrome has recently rolled out trial support for the Native Filesystem API, which lets web apps open and edit local files. So it's an ideal test case.

I want to be clear, Google's not doing anything wrong here. Caret is licensed under the GPL, which means pretty much anyone can take it and do whatever they want, as long as they give me credit for the code I wrote and distribute the source, both of which are happening here. They haven't been rude about it (Ben, the earlier developer, very kindly reached out to me first), and even if they were, I couldn't stop it. I intentionally made that decision early on with Caret, because I believe giving the code away for something as fundamental as a text editor is the right thing to do.

That said, my feelings about these ports are extremely mixed.

On the one hand, after a half-decade of semi-active development, Caret has found a nice audience among students and amateur hackers. If it's possible to expand that audience — to use Google's market power to give more students, and more amateurs, the tools to realize their own goals — that's an exciting possibility.

But let's be clear: the reason why a port is necessary is because Google has been slowly removing support for Chrome Apps like Caret from their browser, in favor of active development on progressive web apps. After building on their platform and watching them strip support away from my users on Windows and OS X, with the clear intention of eventually removing it from Chrome OS after its Android support is advanced enough, I'm not particularly thrilled about the idea of using it to push PR for new APIs in Chrome (no other browsers have announced support for Native Filesystem).

People have ported Caret before. But it feels very different when it's a random person who wants to add a particular feature, versus a giant tech corporation with a tremendous amount of power and influence. If Google wants to become the new "owner" of Caret, they're perfectly capable of it. And there's nothing I can do to stop them. Whether they're going to do this or not (I'm pretty sure they won't) doesn't stop my heart from skipping a beat when I think about it. The power gradient here is unsettling.

Recently, a group of journalism students at Northwestern University here in Illinois came under fire after apologizing for and retracting parts of their coverage of protests against former Attorney General Jeff Sessions. Their critics included the usual suspects, like Bari Weiss, the NYT columnist who regularly publishes columns in the biggest paper in the world about how she's being silenced by critics, but also a number of legitimate journalists concerned about self-censorship. But the editorial itself is quite clear on why they took this step, including one telling paragraph:

We also wanted to explain our choice to remove the name of a protester initially quoted in our article on the protest. Any information The Daily provides about the protest can be used against the participating students — while some universities grant amnesty to student protesters, Northwestern does not. We did not want to play a role in any disciplinary action that could be taken by the University. Some students have also faced threats for being sources in articles published by other outlets. When the source in our article requested their name be removed, we chose to respect the student’s concerns for their privacy and safety. As a campus newspaper covering a student body that can be very easily and directly hurt by the University, we must operate differently than a professional publication in these circumstances.

You may disagree with the idea that journalists should take down or adjust coverage of public events and persons, but it is legitimately more complicated than just "liberal snowflakes bowing to public pressure." No-one is debating that the reporters can take pictures of public protests, or publish the names of those involved. But should they? Likewise, when a newsroom's community is upset about coverage, editors can ignore the outcry, or respond with scorn. It shouldn't be surprising that certain audiences turn away or become distrustful of a paper that does so.

The relationships in this situation, as with various ports of Caret, are complicated by power. In both cases, what would be permissible or normal in one context is changed by the power differential of the parties involved, whether that's students to the paper to the university, or me to Google, or data journalists to the people in their FOIA requests, or tech workers to their employers' government contracts.

Most newsrooms don't think very much about power, in my experience, or they think of it as something they're supposed to check, not something they possess. But we need to take responsibility for our own power. It's possible that the students at the Daily Northwestern overreacted — if you protest in public, you should probably expect that pictures are going to be taken — but they're at least engaging with the question of what to do with the power they wield (directly and, in the case of the university's discipline system, indirectly). Using power in ways that have a real chance of harming your readers, just on principle and the idea that "that's what journalists do," is tautocracy at work.

As much as anything, I think this is one of the key generational shifts taking place in both software and journalism. My own sympathies tend toward a vision of both that prioritizes harm reduction over abstractions like "free speech" or "intellectual property," but I don't have any pat answers. Similarly, I've become acclimated to the idea of a web-based Caret port that's out of my hands, because I think the benefits to users outweigh the frustration I feel personally. I can't do anything about it now. But I will definitely learn from this experience, and it will change how I plan future projects.
