
November 20, 2019

Filed under: tech»open_source

Repackaged apps

Earlier this week, a member of the Google developer relations team ported Caret to the web. He's actually the second person from Chrome to do this — a member of the browser team created a separate port last month. The reasons for this are simple: Caret is a complete application with a relatively small API surface, most of which revolves around file I/O. Chrome has recently rolled out trial support for the Native Filesystem API, which lets web apps open and edit local files. So it's an ideal test case.

I want to be clear, Google's not doing anything wrong here. Caret is licensed under the GPL, which means pretty much anyone can take it and do whatever they want, as long as they give me credit for the code I wrote and distribute the source, both of which are happening here. They haven't been rude about it (Ben, the earlier developer, very kindly reached out to me first), and even if they were, I couldn't stop it. I intentionally made that decision early on with Caret, because I believe giving the code away for something as fundamental as a text editor is the right thing to do.

That said, my feelings about these ports are extremely mixed.

On the one hand, after a half-decade of semi-active development, Caret has found a nice audience among students and amateur hackers. If it's possible to expand that audience — to use Google's market power to give more students, and more amateurs, the tools to realize their own goals — that's an exciting possibility.

But let's be clear: the reason why a port is necessary is because Google has been slowly removing support for Chrome Apps like Caret from their browser, in favor of active development on progressive web apps. After building on their platform and watching them strip support away from my users on Windows and OS X, with the clear intention of eventually removing it from Chrome OS after its Android support is advanced enough, I'm not particularly thrilled about the idea of using it to push PR for new APIs in Chrome (no other browsers have announced support for Native Filesystem).

People have ported Caret before. But it feels very different when it's a random person who wants to add a particular feature, versus a giant tech corporation with a tremendous amount of power and influence. If Google wants to become the new "owner" of Caret, they're perfectly capable of it. And there's nothing I can do to stop them. Whether they're going to do this or not (I'm pretty sure they won't) doesn't stop my heart from skipping a beat when I think about it. The power gradient here is unsettling.

Recently, a group of journalism students at Northwestern University here in Illinois came under fire for apologizing for, and partially retracting, their coverage of protests against former Attorney General Jeff Sessions. The critics include the usual suspects, like Bari Weiss, the NYT columnist who regularly publishes columns in the biggest paper in the world about how she's being silenced by critics, but also a number of legitimate journalists concerned about self-censorship. But the editorial itself is quite clear on why they took this step, including one telling paragraph:

We also wanted to explain our choice to remove the name of a protester initially quoted in our article on the protest. Any information The Daily provides about the protest can be used against the participating students — while some universities grant amnesty to student protesters, Northwestern does not. We did not want to play a role in any disciplinary action that could be taken by the University. Some students have also faced threats for being sources in articles published by other outlets. When the source in our article requested their name be removed, we chose to respect the student’s concerns for their privacy and safety. As a campus newspaper covering a student body that can be very easily and directly hurt by the University, we must operate differently than a professional publication in these circumstances.

You may disagree with the idea that journalists should take down or adjust coverage of public events and persons, but it is legitimately more complicated than just "liberal snowflakes bowing to public pressure." No one is debating whether the reporters can take pictures of public protests, or publish the names of those involved. But should they? Likewise, when a newsroom's community is upset about coverage, editors can ignore the outcry, or respond with scorn. It shouldn't be surprising that certain audiences turn away from, or become distrustful of, a paper that does so.

The relationships in this situation, as with various ports of Caret, are complicated by power. In both cases, what would be permissible or normal in one context is changed by the power differential of the parties involved, whether that's students to the paper to the university, or me to Google, or data journalists to the people in their FOIA requests, or tech workers to their employers' government contracts.

Most newsrooms don't think very much about power, in my experience, or they think of it as something they're supposed to check, not something they possess. But we need to take responsibility for our own power. It's possible that the students at the Daily Northwestern overreacted — if you protest in public, you should probably expect that pictures are going to be taken — but they're at least engaging with the question of what to do with the power they wield (directly and, in the case of the university's discipline system, indirectly). Using power in ways that have a real chance of harming your readers, just on principle and the idea that "that's what journalists do," is tautocracy at work.

As much as anything, I think this is one of the key generational shifts taking place in both software and journalism. My own sympathies tend toward a vision of both that prioritizes harm reduction over abstractions like "free speech" or "intellectual property," but I don't have any pat answers. Similarly, I've become acclimated to the idea of a web-based Caret port that's out of my hands, because I think the benefits to users outweigh the frustration I feel personally. I can't do anything about it now. But I will definitely learn from this experience, and it will change how I plan future projects.

November 1, 2019

Filed under: movies»commentary»horror

Wake me up when Shocktober ends

When I was a kid in Lexington, Kentucky, I remember that grocery stores would have a little video rental section at the front of the store, just a few shelves stocked with VHS tapes. I used to be fascinated by the horror movies: when my parents were checking out, I would often walk over and look at the box art, which had its own special, lurid appeal. It was the golden age of plasticky, rubbery practical effects. I could have stared at the cover for Ghoulies for hours, wondering what the movie inside was like.

This year, for the first time, I decided to celebrate Shocktober: watching a horror movie for every day in the month before Halloween. In particular, I tried to watch a lot of the movies my 7-year-old self would have wanted to see. It turns out that these were not generally very good! My full list is below, with the standouts in bold.

  1. Children of the Corn
  2. Nightmare on Elm Street (2010)
  3. Green Room
  4. We Have Always Lived In The Castle
  5. Ma
  6. The Conjuring
  7. Pumpkinhead
  8. Halloween 2
  9. Hellraiser
  10. Black Christmas
  11. Insidious
  12. Doom: Annihilation
  13. Candyman
  14. Little Evil
  15. Cam
  16. Chopping Mall
  17. House (1986)
  18. Creep (2014)
  19. The Perfection
  20. They Wait
  21. My Bloody Valentine (1981)
  22. Ginger Snaps
  23. The Gate
  24. Prophecy
  25. Halloween 3
  26. In the Tall Grass
  27. Head Count
  28. 1922
  29. Emelie
  30. Train to Busan
  31. The Babysitter
  32. The Ring

One thing that becomes obvious very quickly is how inconsistent the horror genre is: not only is it extremely prone to fashion, but also to drought. The mid-to-late 80s had a lot of real stinkers — either "comedy" horror like House, nonsense slashers like My Bloody Valentine, or just mistakes (Children of the Corn, which is amateurish on almost every level). I suspect this parallels a lot of the CG goofball period of the late 2000s (Darkness Falls, Hollow Man, They).

On the other hand, there are some real classics in there. Black Christmas predates Halloween by four years, and not only probably inspired it but is also a much better movie: more interesting characters, better sense of place, and a wild Pelham 123-style investigation. Candyman and Hellraiser are both fascinating, complicated movies packed with indelible imagery. And Halloween 3 manages to feel like a companion piece to They Live, trading all connection to the mainline series for a bizarre riff on media paranoia.

Somewhere in the middle is Chopping Mall, a movie that's somehow so terrible, so perfectly 1986, that it becomes compulsively watchable. Its effects are bad, the characters are thinly drawn and largely there for gratuitous nudity, and its marketing materials wildly overpromise what it will deliver. It's perfect, I love it, and I name it the official movie of Shocktober 2019.

May 21, 2019

Filed under: tech»web

Radios Hack

The past few months, I've mostly been writing in public for NPR's News Apps team blog, with posts on the new Dailygraphics rig (and setting it up on Windows), the Mueller report redactions, and building a scrolling audio story. However, in my personal time, I decided to listen to some podcasts. So naturally, I built a web-based listener app, just for me.

I had a few goals for Radio as I was building it. The first was my own personal use case, in which I wanted to track and listen to a few podcasts, but not actually install a dedicated player on my phone. A web app makes perfect sense for this kind of ephemeral use case, especially since I'm not ever really offline anymore. I also wanted to try building something entirely using Web Components instead of a UI framework, and to use modern features like import — in part because I wanted to see if I could recommend it as a standard workflow for younger developers, and for internal newsroom tools.

Was it a success? Well, I've been using it to listen to Waypoint and Says Who for the last couple of months, so I'd say on that metric it proved itself. And I certainly learned a lot. There are definitely parts of this experience that I can whole-heartedly recommend — importing JavaScript modules instead of using a bundler is an amazing experience, and is the right kind of tradeoff for the health of the open web. It's slower (equivalent to dynamic AMD imports) but fast enough for most applications, and it lets many projects opt entirely out of beginner-unfriendly tooling.

Not everything was as smooth. I'm on record for years as a huge fan of Web Components, particularly custom elements. And for an application of this size, it was a reasonably good experience. I wrote a base class that automated some of the rough edges, like templating and synchronizing element attributes and properties. But for anything bigger or more complex, there are some cases where the platform feels lacking — or even sometimes actively hostile.
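To make that concrete, here's a sketch of the kind of base-class helper I mean — my own illustration, not the actual Radio or Caret code. The idea is a mixin that mirrors a whitelist of attributes onto same-named properties, so setting either one keeps the other in sync without per-element boilerplate. The names `reflect` and `mirrored` are mine:

```javascript
// Hypothetical helper: wraps a base class so that a list of attributes
// is mirrored as same-named properties. `el.src = "feed.xml"` and
// `el.setAttribute("src", "feed.xml")` then stay in sync automatically.
function reflect(Base, mirrored) {
  return class extends Base {
    // Custom elements use this list to fire attributeChangedCallback.
    static get observedAttributes() {
      return mirrored;
    }
    constructor() {
      super();
      // Define a getter/setter pair for each mirrored attribute.
      for (const name of mirrored) {
        Object.defineProperty(this, name, {
          get: () => this.getAttribute(name),
          set: (value) => this.setAttribute(name, String(value)),
        });
      }
    }
  };
}

// In a browser, usage would look something like:
// class PodcastItem extends reflect(HTMLElement, ["src", "title"]) { ... }
// customElements.define("podcast-item", PodcastItem);
```

It's a small thing, but it's exactly the kind of bookkeeping that frameworks normally absorb for you, and that the raw platform makes you write yourself.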

For example: in the modern, V1 spec for custom elements, they're not allowed to manipulate their own contents until they've been placed in the page. If you want to skip the extra bookkeeping that would require, you are allowed to create a shadow root in the constructor and put whatever HTML you want inside. It feels very much like this is the workflow you're supposed to use. But shadow DOM is harder to inspect at the moment (browser tools tend to leave it collapsed when inspecting the page), and it causes problems with events (which do not cross the shadow DOM boundary unless you alter your dispatch code).

There's also no equivalent in Web Components for the state management that's core to most modern frameworks. If you want to pass information down to child components from the parent, it either needs to be set through attributes (meaning you only get strings) or properties (more bookkeeping during the render step). I suspect if I were building something larger than this simple list-of-lists, I'd want to add something like Redux to manage data. This would tie my components to that framework, but it would substantially clean up the code.

Ironically, the biggest hassle in the development process was not from a new browser feature, but from a very old one: while it's very easy to create an audio tag and set its source to any sound clip on the web, actually getting the list of audio files is often impossible, thanks to CORS. Most podcasts do not publish their episode feeds with the cross-origin header set, so the browser's security settings shut down the AJAX requests completely. It's wild that in 2019, there's still no good way to make a secure request (say, one that transmits no cookies or custom headers) to another domain. I ended up running the final app on Glitch, which provides basic Node hosting, so that I could deploy a simple proxy for feed data.

For me, the neat thing about this project was how it brought back the feeling of hackability on the web, something I haven't really felt since I first built Caret years ago. It's so easy to get something spun up this way, and that's a huge incentive for creating little personal apps. I love being able to make an ugly little app for myself in only a few hours, instead of needing to evaluate between a bunch of native apps run by people I don't entirely trust. And I really appreciated the ways that Glitch made that easy to do, and emphasized that in its design. It helps that podcasting, so far, is still a platform built on open web tech: XML and MP3. More of this, please!

March 21, 2019

Filed under: journalism»ethics

Reporting through the Scramble Suit

A proposal for responsible and ethical publication of personally-identifiable information in data journalism

Thanks to Helga Salinas, Kazi Awal, and Audrey Carlsen for their feedback.

Introduction

Over the last decade, one of the goals of data journalism has been to increase accountability and transparency through the release of raw data. Admonitions of "show your work" have become common enough that academics judge our work by the datasets we link to. These goals were admirable, and (in the context of legitimizing data teams within legacy organizations) even necessary at the time. But in an age of 8chan, Gamergate, and the rise of violent white nationalism, it may be time to add nuance to our approach.

This document is concerned primarily with the publication of personal data (also known as personally-identifiable information, or PII). In other words, we're talking about names, addresses or contact info, lat/long coordinates and other geodata, ID numbers (including license plates or other government ID), and other data points that can be traced back to a single individual. Much of this is already available in the public record, but that's no excuse: as the NYT Editorial Board wrote in 2018, "just because information is public doesn't mean it has to be so easy for so many people to get." It is irresponsible to amplify information without thinking about what we're amplifying and why.

Moreover, this is not a theoretical discussion: many newsroom projects start with large-scale FOIA dumps or public databases, which may include exactly this personal data. There have been movements in recent years to monetize these databases — creating a queryable database of government salaries, for example, and offering it via a subscription. Even random public records requests may disclose personal data. Intentionally or not, we're swimming in this stuff, and have become jaded as to its prevalence. So I ask: is it right for us to simply push it out, without re-examining the implications of doing so?

I would stress that I'm not the only person who has thought about these things, and there are a few signs that we as an industry are beginning to formalize our thought process in the same way that we have standards around traditional reporting:

  • The Markup's ethics policy contains guidelines on personal data, including a requirement to set an expiration date (after which point it is deleted).
  • Reveal's ethics guide doesn't contain specific data guidelines, but does call out the need to protect individual privacy: "Recognize that private people have a greater right to control information about themselves than do public officials and others who seek power, influence or attention. Only an overriding public need can justify intrusion into anyone's privacy."
  • The New York Times ran a session at NICAR 2019 on "doxxing yourself," in part to raise awareness of how vulnerable reporters (and by extension, readers) may be to targeted harassment and tracking.
  • A 2016 SRCCON session on "You're The Reason My Name Is On Google: The Ethics Of Publishing Public Data" explored real-world lessons from the Texas Tribune's salary databases (transcript here).

Why the concern?

In her landmark 2015 book The Internet of Garbage, Sarah Jeong sets aside an entire chapter just for harassment. And with good reason: the Internet has enabled new innovations for old prejudices, including SWATting, doxing, and targeted threats at a new kind of scale. Writing about Gamergate, she notes that the action of its instigator, Eron Gjoni, "was both complicated and simple, old and new. He had managed to crowdsource domestic abuse."

I choose to talk about harassment here because I think it provides an easy touchstone for the potential dangers of publishing personal information. Since Latanya Sweeney's initial work on de-anonymizing data, an entire industry has grown up around taking disparate pieces of information, both public and private, and matching them against each other to create alarmingly-detailed profiles of individual people. It's the foundation of the business model for Facebook, as well as a broad swathe of other technology companies. This information includes your location over time. And it's available for purchase, relatively cheaply, by anyone who wants to target you or your family. Should we contribute, even in a minor way, to that ecosystem?

These may seem like distant or abstract risks, but that may be because for many of us, this harassment is more distant or abstract than it is for others. A survey of "news nerds" in 2017 found that more than half are male, and three-quarters are white (a demographic that includes myself). As a result of this background, many newsrooms have a serious blind spot when it comes to understanding how their work may be seen by (or used against) underrepresented populations.

As numerous examples have shown, we are very bad as an industry at thinking about how our power to amplify and focus attention is used. Even if harassment is not the ultimate result, publishing personal data may be seen by our audience as creepy or intrusive. At a time when we are concerned with trust in media, and when that trust is under attack from the top levels of government, perhaps we should be more careful in what data we publish, and how.

Finally, I think it is useful to consider our twin relationship to power and shame. Although we don't often think of it this way, the latter is often a powerful tool in our investigative reporting. After all, as the fourth estate, we do not have the power to prosecute or create legislation. What we can do is highlight the contrast between the world as we want it to be and as it actually is, and that gulf is expressed through shame.

The difference between tabloid reporting and "legitimate" journalism is the direction in which that shame is pointed. The latter targets its shame toward the powerful, while the former is as likely to shame the powerless. In terms of accountability, it orients our power against the system, not toward individual people. It's the difference between reporting on welfare recipients buying marijuana, as opposed to looking at how marijuana licensing perpetuates historical inequalities from the drug war.

Our audiences may not consciously understand the role that shame plays in our journalism, but they know it's a part of the work. They know we don't do investigations in order to hand out compliments and community service awards. When we choose to put the names of individuals next to our reporting, we may be doing it for a variety of good reasons (perhaps we worked hard for that data, or sued to get it) but we should be aware that it is often seen as an implication of guilt on the part of the people within.

Potential guidelines for public release of PII

I want to be very clear that I am only talking about the public release of data in this document. I am not arguing that we should not submit FOIA or public records requests for personal data, or that it can't be useful for reporting. I'm also not arguing that we should not distribute this data at all, in aggregated form, on request, or through inter-organizational channels. It is important for us to show our work, and to provide transparency. I'm simply arguing that we don't always need to release raw data containing personal information directly to the public.

In the spirit of Maciej Ceglowski's Haunted by Data, I'd like to propose we think of personal data in three escalating levels of caution:

Don't collect it!

When creating our own datasets, it may be best to avoid personal data in the first place. Remember, you don't have to think about the implications of the GDPR or data leaks if you never have that information. When designing forms for story call-outs, try to find ways to automatically aggregate or avoid collecting information that you're not going to use during reporting anyway.

Don't dump it!

If you have the raw data, don't just throw it out into the public eye because you can. In general, we don't work with raw data for reporting anyway: we work with aggregates or subsets, because that's where the best stories live. What's the difference in policy effects between population groups? What department has the widest salary range in a city government? Where did a disaster cause the most damage? Releasing data in an aggregate form still allows end-users to check your work or perform follow-ups. And you can make the full dataset available if people reach out to you specifically over e-mail or secure channels (but you'll be surprised how few actually do).

Don't leave it raw!

In cases where distributing individual rows of data is something you're committed to doing, consider ways to protect the people inside the data by anonymizing it, without removing its potential usefulness. For example, one approach that I love from ProPublica Illinois' parking ticket data is the use of one-way hash functions to create consistent (but anonymous) identifiers from license plates: the input always creates the same output, so you can still aggregate by a particular car, but you can't turn that random-looking string of numbers and letters back into an actual license plate. As opposed to "cooking" the data, we can think of this as "seasoning" it, much as we would "salt" a hash function. A similar approach was used in the infosec community in 2016 to identify and confirm sexual abusers in public without actually posting their names (and thus opening the victims up to retaliation).

Toward a kinder, more empathic data journalism

Once upon a time, this industry thought of computer-assisted reporting as a new kind of neutral standard: "precision" or "scientific" journalism. Yet as Catherine D'Ignazio and Lauren Klein point out in Data Feminism, CAR is not neutral, and neither is the way that the underlying data is collected, visualized, and distributed. Instead, like all journalism, it is affected by concerns of race, gender, sexual identity, class, and justice.

It's my hope that this proposal can be a small step to raise the profile of these questions, particularly in legacy newsrooms and journalism schools. In working on several projects at The Seattle Times and NPR, I was surprised to find that although there are guidelines on how to ethically source and process data, it was difficult to find formal advice on ethical publishing of that same data. Other journalists have certainly dealt with this, and yet there are relatively few documents that lay out concrete guidelines on the matter. We can, and should, change that.

March 7, 2019

Filed under: movies»commentary»scifi

Stop trying to hit me

At the end of this month, in keeping with the horrifying march of time, The Matrix turns 20 years old. It's hard to overstate how mind-blowing it was for me, a high-schooler at the time, when the Wachowski sisters' now-classic marched into theaters: combining entirely new effects techniques with Hong Kong wire-work martial arts, it's still a stylish and mesmerizing tour de force.

The sequels... are not. Indeed, little of the Wachowskis' post-Matrix output has been great, although there's certainly a die-hard contingent that argues for Speed Racer and Sense8. But in rewatching them this month, I've been struck by the ways that Reloaded and Revolutions almost feel like the work of entirely different filmmakers, ones who have thrown away one of their most powerful storytelling tools. By that, I mean the fight scenes.

The Matrix has a few set-piece fight scenes, and they're not all golden. The lobby gunfight, for example, doesn't hold up nearly as well on rewatch. But at their best, the movie's action segments deftly thread a needle between "cool to watch" and "actively communicating plot." Take, for example, the opening chase between Trinity, some hapless cops, and a pair of agents:

In a few minutes, we learn that A) Trinity is unbelievably dangerous, and B) however competent she is, she's utterly terrified by the agents. We also start to see hints of their character: one side engaged in agile, skilled hit-and-run tactics, while the authorities bully through on raw power. And we get the sense that while there are powers at work here, it's not the domain of magic spells. Instead, Trinity's escape bends the laws of time and space — in a real way, to be able to manipulate the Matrix is to be able to control the camera itself.

But speaking of rules that can be bent or broken, we soon get to the famous dojo training sequence:

I love the over-the-top kung fu poses that start each exchange, since they're such a neat little way of expressing Neo's distinct emotional progress through the scene: nervousness, overconfidence, determination, fear, self-doubt, and finally awareness. Fishburne absolutely sells his lines ("You think that's air you're breathing now...?"), but the dialog itself is almost superfluous.

The trash-as-tumbleweed is a nice touch to start the last big brawl of the movie, as is the Terminator-esque destruction of Smith's sunglasses. But pay close attention to the specific choreography here: Smith's movements are, again, all power and no technique. During the fight, he hardly even blocks, and there aren't any fancy flips or kicks. But halfway through, after the first big knock-down, Neo starts to use the agent's own attack routines against him, while adding his own improvisations and style at the end of each sequence. One of these characters is dynamic and flexible, and one of them is... well, a machine. We're starting to see the way that the ending will unfold, right here.

What do all these fight scenes have in common? Why are they so good? Well, in part, they're about creating a readable narrative for each character in the shot, driving their action based on the emotional needs of a few distinct participants. Yuen Woo Ping is a master at this — it's practically the defining feature of Crouching Tiger, Hidden Dragon, on which he did fight direction a year later, and in which almost every scene combines character and action seamlessly. Tom Breihan compares it to the role that song-and-dance numbers play in a musical in his History of Violence series, and he's absolutely right. Even without subtitles or knowledge of Mandarin, this scene is beautifully eloquent:

By contrast, three years later, The Matrix Reloaded made its centerpiece the so-called "burly brawl," in which a hundred Agent Smiths swarm Neo in an empty lot:

The tech wasn't there for the fight the Wachowskis wanted to show — digital Keanu is plasticky and weirdly out-of-proportion, while Hugo Weaving's doppelgangers only get a couple of expressions — but even if they had modern, Marvel-era rendering, this still wouldn't be a satisfying scene. With so many ambiguous opponents, we're unable to learn anything about Neo or Smith here. There's no mental growth or relationship between two people — just more disposable mooks to get punched. "More" is not a character beat. But for this movie and for Revolutions, the Wachowskis seemed to be convinced that it was.

At the end of the day, none of that makes the first movie any less impressive. It's just a shame that for all the work that went into imitating bullet time or tinting things green, almost nobody ripped off the low-tech narrative choices that The Matrix made. Yuen Woo Ping went back to Hong Kong, and Hollywood pivoted to The Fast and the Furious a few years later.

But not to end on a completely down note, there is one person who I think actually got it, and that's Keanu Reeves himself. The John Wick movies certainly have glimmers of it, even if the fashion has swung from wuxia to MMA. And Reeves' directorial debut, Man of Tai Chi, is practically an homage to the physicality of the movie that made him an action star. If there is, in fact, a plan to reboot The Matrix as a new franchise, I legitimately think they should put Neo himself in the director's chair. It might be the best way to capture that magic one more time.

February 13, 2019

Filed under: movies»commentary»scifi

Sunshine

Sunshine wasn't particularly loved when it was released in 2007, despite a packed cast and direction by Danny Boyle. In the years since, it has somehow stubbornly avoided cult status — before its time, maybe, or just too odd, as it swings wildly between hard sci-fi, psychological drama, survival horror, and eventually straight-up slasher flick by way of Apocalypse Now. But it's intensely watchable and, I would argue, underappreciated, especially in comparison to writer Alex Garland's follow-up attempts on the same themes.

"Our sun is dying," Cillian Murphy mutters at the start of the film, and the tone remains pretty grim from there. The spaceship Icarus II is sent on a desperate trip to restart the sun by tossing a giant cubic nuclear bomb into it — a desperate quest, made all the more desperate by the fact that nobody on the mission seems particularly stable or well-suited to the job. Boyle sketches out each crew member quickly but adeptly, giving each one a well-defined (if sometimes precious) persona, like the neurotic psychologist, the hot-tempered engineer, or the botanist who cares more for her oxygen-producing plants than the people onboard (or, viewers suspect, the mission itself). NASA would never put these people in a small space for more than a day, but they're a marvel of small-scale human conflict almost from the very start.

That approach to character is emblematic of Sunshine's construction, which is really less of a plot and more of a set of simple machines rigged in opposition to each other. An early miscalculation in the position of the ship's sun shield leads to a series of cascading crises, each of which provides both physical challenge as well as ratcheting tension among the crew from dwindling resources. Yet there's only one real plot twist in the whole thing: the murderous Captain Pinbacker of Icarus I, driven mad by his own journey toward the sun. Everything else is established clearly and methodically, with ample recall and signposting — it's the rare science fiction movie that doesn't cheat. Even Pinbacker's slasher-esque rampage shows up in clues for savvy viewers, who can clock a missing scalpel and scattered bloody handprints on rewatch.

Similar to an obvious inspiration (and personal favorite), Alien, one of the film's greatest special effects is the cast. Boyle gets a lot of mileage out of Cillian Murphy's After Effects-blue eyes, but you can't go wrong with Chris Evans, Michelle Yeoh, Benedict Wong, and Rose Byrne. Still, for my money, Cliff Curtis is the film's MVP: as the doctor/psychologist Searle, he's bomb-thrower and mediator in equal measure. His obsession with the sun leaves him visibly burned, like a Dorian Gray painting of the crew's mental health. And yet, unlike Pinbacker (whom he clearly parallels), Curtis manages to keep his perspective straight and a wry sense of humor — he may love the light, but he's not blinded by it.

So why isn't Sunshine canonized, especially in a climate-change world where "our sun is dying" passes for optimism? Why is it considered a misfire, when Garland's flawed Annihilation was seen as a cult hit in the making? It's still not clear to me. Maybe it just got lost in the shuffle: 2007 was a good year for movies, including There Will Be Blood for the serious film aficionados and The Bourne Ultimatum or Death Proof for fans of surprisingly well-crafted genre fare. Or maybe it's also just too close to its nearest relatives: too easy to write off as "Event Horizon without the schlocky fun" or "Solaris, but for stupid people." Either way, it feels overdue for reconsideration.

December 14, 2018

Filed under: journalism»ethics

Lightning Power

This post was originally written as a lightning talk for SRCCON:Power. And then I looked at the schedule, and realized they weren't hosting lightning talks, but I'd already written it and I like it. So here it is.

I want to talk to you today about election results and power.

In the last ten years, I've helped cover the results for three newsrooms at very different scales: CQ (high-profile subscribers), Seattle Times (local), and NPR (shout out to Miles and Aly). I say this not because I'm trying to show off or claim some kind of authority. I'm saying it because it means I'm culpable. I have sinned, and I will sin again, may God have mercy on my soul.

I used to enjoy elections a lot more. These days, I don't really look forward to them as a journalist. This is partly because the novelty has worn off. It's partly because I am now old, and 3am is way past my bedtime. But it is also in no small part because I'm really uncomfortable with the work itself.

Just before the midterms this year, Tom Scocca wrote a piece about the rise of tautocracy — meaning, rule by mulish adherence to the rules. Government for its own sake, not for a higher purpose. When a judge in Nebraska rules that disenfranchising Native American voters is clearly illegal, but will be permitted under regulations forbidding last-minute election changes — even though the purpose of that regulation is literally to prevent voter disenfranchisement — that's tautocracy. Having an easy election is more important than a fair one.

For those of you who have worked in diversity and inclusion, this may feel a little like the "civility" debate. That's not a coincidence.

I am concerned that when we cover elections with results pages and breaking alerts, we're more interested in the rules than we are in the intended purpose. It reduces the election to the barest essence — the score, like a football game — divorced from context or meaning. And we spend a tremendous amount of redundant resources across the industry trying to get those scores faster or flashier. We've actually optimized for tautocracy, because that's what we can measure, and you always optimize for your metrics.

But as the old saying goes, elections have consequences. Post-2016, even the most privileged and most jaded of us have to look around at a rising tide of white nationalism and ask, did we do anything to stop this? Worse, did we help? That's an uncomfortable question, particularly for those of us who have long believed (incorrectly, in my opinion) that "we just report the news."

Take another topic, one that you will be able to sell more easily to your mostly white, mostly male senior editors when you get back: Every story you run these days is a climate change story. Immigration, finance, business, politics both international and domestic, health, weather: climate isn't just going to kill us all, it also affects nearly everything we report on. It's not just for the science stories in the B section anymore. Every beat is now the climate beat.

Where was climate in our election dashboard? Did anyone do a "balance of climate?"

How will electoral power be used? And against whom?

Isn't that an election result?

What would it look like if we took the tremendous amount of duplicated effort spent on individual results pages, distributed across data teams and lonely coders around the country, and spent it on those kinds of questions instead?

The nice thing about a lightning talk is that I don't have time to give you any answers. Which is good, because I'm not smart enough to have any. All I know is that the way we're doing it isn't good enough. Let's do better.

Thank you.

[SPARSE, SKEPTICAL APPLAUSE]

November 10, 2018

Filed under: gaming»design»aaa

Loot Pack

If you wanted to look at the general direction of AAA game development, you could do worse than God of War and Horizon: Zero Dawn (coincidentally, of course, the last two titles I finished on my usually-neglected PS4). They're both big-budget tentpole releases, with all the usual caveats that come with that: graphically rich and ridiculously detailed, including high-priced voice/acting talent, but not particularly innovative in terms of gameplay. But even within this space, it's interesting to see the ways they diverge — and the maybe-depressing tricks they share.

Of the two, Horizon (or, as Belle dubbed it for some reason, Halo: Dark Thirty) is the better game. In many ways, it's built on a simplified version of the A-life principles that powered Stalker and its sequels: creating interesting encounters by placing varied opponents in open, complex environments. The landscape is gorgeous and immense, with procedurally-generated vegetation and wildlife across a wide variety of terrain with a full day-night cycle. It's pretty and dynamic enough that you don't tend to notice how none of the robotic creatures you're fighting really pay attention to each other apart from warning about your presence — you can brainwash the odd critter into fighting on your side with a special skill, but otherwise almost everything on the map is gunning for you and you alone.

In fact, Horizon's reliance on procedural generation and systems is both its strong suit and its weak point. It's hard to imagine hand-crafting a game this big (Breath of the Wild notwithstanding), but there's a huge gulf between filling a landscape according to a set of gameplay rules and depicting realistic human behavior. One-on-one conversations and the camera work around them are shockingly clumsy compared to the actual (pretty good!) voice work or the canned animation sections. Most of the time, during these scenes, I was just pressing a button to get back to mutilating robot dinosaurs. But you can see where the money went in Horizon: lots for well-rendered undergrowth, not so much for staying out of the uncanny valley.

By contrast, God of War is really interested in its characters, as close-up as possible. The actual game is not good — the combat is cramped and difficult to read, ironically because of its love affair with a single-take camera (which is kind of a weirdly pointless gimmick in a video game — almost every FPS since Half-Life is already a single-take shot). It's small and short by comparison to open-world games, with its levels hand-crafted around a linear story. I was shocked at how quickly it wrapped up.

But there's no awkward "crouching-animation followed by two-shot conversation tree" here: it may be a six-hour storyline, but it's beautifully motion-captured and animated. When Jeremy Davies is on-screen as Baldur, it's recognizably Jeremy Davies — not just in the facial resemblance, but in the way he moves and the little tics he throws in. There's maybe five characters to speak of in the whole thing, but they come through as real performances (Sindri, the dwarven blacksmith with severe neat-freak tendencies, is one of my favorites). In retrospect, I may wish I had just watched a story supercut on YouTube, but there's no doubt that it's an expensive, expressive production.

Where both games share mechanics, unfortunately, is a common trick in AAA design these days: crafting and loot systems. Combat yields random, color-coded rewards similar to an MMO, and those rewards are then used alongside some sort of currency to unlock features, skills, or equipment. It extends the gameplay by putting your progress behind a certain number of hours grinding through the combat loops, as this is cheaper than actually creating new content at the level of richness and detail that HD games demand.

If your combat is boring (God of War), this begins to feel like punishment, especially if it's not particularly well-signposted that some enemies are just beyond your reach early in the game. It bothered me less in Horizon, where I actually enjoyed the core mechanical loop, but even there playability suffered: the most interesting parts of the game involve using a full set of equipment to manipulate encounters (or recover when they go wrong), but most of that toolkit is locked behind the crafting system to start. Instead of giving players more options and asking them to develop a versatile skillset from the start, it's just overwhelmingly lethal to them for the first third of its overall length (a common problem — it's tough to create a good skill gate when so much of the game is randomized).

Ironically, while AAA games have gone toward opaque, grind-heavy loot systems, indies these days have swung more toward roguelikes and Metroidvanias: intentionally lethal designs that marry a high skill ceiling with a very clear unlock progression. It may be a far cry from Nintendo's meticulous four-step teaching structure, but since indie developers aren't occupied with filling endless square miles of hand-crafted landscape, they've sidestepped the loot drop trap. Will the big titles learn from that, or from the "loot-lite" system that underlies Breath of the Wild's breakable weapons? I hope so. But the economics of HD assets seem hard to argue with, barring some kind of deeply disruptive new trend.

October 2, 2018

Filed under: tech»coding

Generators: the best JS feature you're not using

People love to talk about "beautiful" code. There was a book written about it! And I think it's mostly crap. The vast majority of programming is not beautiful. It's plumbing: moving values from one place to another in response to button presses. I love indoor plumbing, but I don't particularly want to frame it and hang it in my living room.

That's not to say, though, that there isn't pleasant code, or that we can't make the plumbing as nice as possible. A program is, in many ways, a diagram for a machine that is also the machine itself — when we talk about "clean" or "elegant" code, we mean cases where those two things dovetail neatly, as if you sketched an idea and it just happened to do the work as a side effect.

In the last couple of years, JavaScript updates have given us a lot of new tools for writing code that's more expressive. Some of these have seen wide adoption (arrow functions, template strings), and deservedly so. But if you ask me, one of the most elegant and interesting new features is generators, and they've seen little to no adoption. They're the best JavaScript syntax that you're not using.

To see why generators are neat, we need to know about iterators, which were added to the language in pieces over the years. An iterator is an object with a next() method, which you can call to get a new result with either a value or a "done" flag. Initially, this seems like a fairly silly convention — who wants to manually call a loop function over and over, unwrapping its values each time? — but the goal is actually to enable new syntax once the convention is in place. In this case, we get the generic for ... of loop, which hides all the next() and result.done checks behind a familiar-looking construct:

    for (var item of iterator) {
      // item comes from iterator, and the loop
      // runs until the "done" flag is set
    }
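To make the protocol concrete, here's a minimal hand-rolled iterator written without any generator syntax — the counter object is hypothetical, just to show what next() and the done flag look like from the consumer's side:

```javascript
// A manual iterator over 1..3: each call to next()
// returns an object shaped like { value, done }.
var counter = {
  current: 1,
  next() {
    if (this.current > 3) {
      return { value: undefined, done: true };
    }
    return { value: this.current++, done: false };
  }
};

// Consuming it by hand, the way for ... of does for you:
var values = [];
var result = counter.next();
while (!result.done) {
  values.push(result.value);
  result = counter.next();
}
// values is now [1, 2, 3]
```

All the ceremony in that while loop is exactly what the new syntax hides.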

Designing iteration as a protocol of specific method/property names, similar to the way that Promises are signaled via the then() method, is something that's been used in languages like Python and Lua in the past. In fact, the new loop works very similarly to Python's iteration protocol, especially with the role of generators: while for ... of makes consuming iterators easier, generators make it easier to create them.

You create a generator by adding a * after the function keyword. Within the generator, you can output a value using the yield keyword. This is kind of like a return, but instead of exiting the function, it pauses it and allows it to resume the next time it's called. This is easier to understand with an example than it is in text:

    function* range(from, to) {
      while (from <= to) {
        yield from;
        from += 1;
      }
    }

    for (var num of range(3, 6)) {
      // logs 3, 4, 5, 6
      console.log(num);
    }

Behind the scenes, calling a generator function produces an iterator. When the function reaches its end, it'll be "done" for the purposes of looping, but internally it can yield as many values as you want. The for ... of syntax will take care of calling next() for you, and the function picks back up from where it was paused at the last yield.
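You can see the pause/resume behavior directly by calling next() yourself — this is roughly what for ... of does under the hood with the range generator from earlier:

```javascript
function* range(from, to) {
  while (from <= to) {
    yield from;
    from += 1;
  }
}

// Calling the generator doesn't run the body; it hands back an iterator.
var iter = range(3, 5);

var first = iter.next();  // { value: 3, done: false }
var second = iter.next(); // { value: 4, done: false } - resumed after the yield
var third = iter.next();  // { value: 5, done: false }
var fourth = iter.next(); // { value: undefined, done: true } - the loop ended
```

Each next() call runs the function body only as far as the following yield, then freezes it again.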

Previously, in JavaScript, if we created a new collection object (like jQuery or D3 selections), we would probably have to add a method on it for doing iteration, like collection.forEach(). This new syntax means that instead of every collection creating its own looping method (that can't be interrupted and requires a new function scope), there's a standard construct that everyone can use. More importantly, you can use it to loop over abstract things that weren't previously loopable.
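As a sketch of what that looks like in practice, a collection class can opt into for ... of just by defining a generator under the well-known Symbol.iterator key — the Collection class below is hypothetical, not any real library's API:

```javascript
// A jQuery-like wrapper made loopable by giving it a
// generator method keyed on Symbol.iterator.
class Collection {
  constructor(items) {
    this.items = items;
  }
  *[Symbol.iterator]() {
    for (var i = 0; i < this.items.length; i++) {
      yield this.items[i];
    }
  }
}

var tags = new Collection(["a", "div", "span"]);
var seen = [];
for (var tag of tags) {
  seen.push(tag);
}
// seen is ["a", "div", "span"]
```

No forEach() method required — the standard loop syntax just works, and it can be broken out of with a plain break or return.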

For example, let's take a problem that many data journalists deal with regularly: CSV. In order to read a CSV, you probably need to get a file line by line. It's possible to split the file and create a new array of strings, but what if we could just lazily request the lines in a loop?

    function* readLines(str) {
      var buffer = "";
      for (var c of str) {
        if (c == "\n") {
          yield buffer;
          buffer = "";
        } else {
          buffer += c;
        }
      }
      // flush the final line if the input doesn't end in a newline
      if (buffer) yield buffer;
    }

Reading input this way is much easier on memory, and it's much more expressive to think about looping through lines directly versus creating an array of strings. But what's really neat is that it also becomes composable. Let's say I wanted to read every other line from the first five lines (this is a weird use case, but go with it). I might write the following:

    function* take(x, list) {
      var i = 0;
      for (var item of list) {
        if (i == x) return;
        yield item;
        i++;
      }
    }

    function* everyOther(list) {
      var other = true;
      for (var item of list) {
        if (other) yield item;
        other = !other;
      }
    }

    // get my weird subset
    var lines = readLines(file);
    var firstFive = take(5, lines);
    var alternating = everyOther(firstFive);
    for (var value of alternating) {
      // ...
    }

Not only are these generators chained, they're also lazy: until I hit my loop, they do no work, and they'll only read as much as they need to (in this case, only the first five lines are read). To me, this makes generators a really nice way to write library code, and it's surprising that it's seen so little uptake in the community (especially compared to streams, which they largely supplant).
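A quick way to see that laziness for yourself: a generator's body doesn't run at all when you call it, only when something actually asks for a value. This toy example (the lazy function is just for illustration) records its progress in a log array:

```javascript
var log = [];

function* lazy() {
  log.push("started");
  yield 1;
  log.push("resumed");
  yield 2;
}

// Calling the generator does no work yet.
var it = lazy();
var beforeFirstNext = log.slice(); // still []

// Only now does the body run, and only up to the first yield.
it.next();
// log is now ["started"] - "resumed" hasn't been pushed
```

That's why a chain like take(5, readLines(file)) never touches line six: nothing upstream runs until the consumer pulls on it.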

So much of programming is just looping in different forms: processing delimited data line by line, shading polygons by pixel fragment, updating sets of HTML elements. But by baking fundamental tools for creating easy loops into the language, it's now easier to create pipelines of transformations that build on each other. It may still be plumbing, but you shouldn't have to think about it much — and that's as close to beautiful as most code needs to be.

September 24, 2018

Filed under: journalism»professional

The Best of Times

About two months ago, just before sneaking out the back door so that nobody in the newsroom would try to do one of those mortifying "everyone clap for the departing colleague" routines, I sent a good-bye e-mail to the Seattle Times newsroom. It read, in part:

I'm deeply grateful to Kathy Best, who made the Interactives team possible in 2014. Kathy doesn't, I think, get enough credit for our digital operation. She was always the first to downplay her expertise in that sphere, not entirely without reason. Yet it is hard to imagine The Seattle Times taking a risk like that anymore: hiring two expensive troublemakers with incomprehensible, oddball resumes for a brand-new team and letting them run wild over the web site.

It was a gamble, but one with a real vision, and in this case it paid off. I'm proud of what we managed to accomplish in my four years here on the Interactives team. I'm proud of the people that the team trained and sent out as ambassadors to other newsrooms, so that our name rings out across the country. And I'm proud of the tools we built and the stories we told.

When I first really got serious about data journalism, the team to beat (ironically enough, now that I've moved to the Windy City) was the Chicago Tribune. It wasn't just that they did good work, and formalized a lot of the practices that I drew on at the Times. It was also that they made good people: ex-Trib folks are all over this industry now, not to mention a similar impact from the NPR visuals team that formed around many of the same leaders a few years later. I wanted to do something similar in Seattle.

That's why there was no better compliment, to my ears, than when I would talk to colleagues at other newsrooms or organizations and hear things like "you've built a pretty impressive alumni network" or "the interns you've had are really something." There's no doubt we could have always done better, but in only four years we managed to build a reputation as a place that developed diverse, talented journalists. People who were on or affiliated with the team ended up at the LA Times, San Francisco Chronicle, Philadelphia Inquirer, New York Times, and ProPublica. We punched above our weight.

I never made a secret of what I was trying to do, but I don't think it ever really took hold in the broader organizational culture. That's a shame: turnover was high at the Seattle Times in my last couple of years there, especially after the large batch of buyouts in early 2017. I still believe that a newsroom that sees departures as an essential tool for recruiting and enriching the industry talent pool would see real returns with just a few simple adjustments.

My principles on the team were not revolutionary, but I think they were effective. Here are a few of the lessons I learned:

  • Make sacrifices to reward high performers. Chances are your newsroom is understaffed and overworked, which makes it tempting to leave smart people in positions where they're effective but unchallenged. This is a mistake: if you won't give staff room to grow, they'll leave and go somewhere that will. It's worth taking a hit to efficiency in one place in order to keep good employees in the room. If that means cutting back on some of your grunt work — well, maybe your team shouldn't be doing that anyway.
  • Share with other newsrooms as much as possible. You don't get people excited about working for your paper by writing a great job description when a position opens up. You do it by making your work constantly available and valuable, so that they want to be a part of it before an opening even exists. And the best way to show them how great it is to work for you is to work with them first: share what you've learned, teach at conferences, open-source your libraries. Make them think "if that team is so helpful to me as an outsider..."
  • Spread credit widely and generously. As with the previous point, people want to work in places where they'll not only get to do cool work out in the open, they'll also be recognized for it. Ironically, many journalists from underrepresented backgrounds can be reluctant to self-promote as aggressively as white men, so use your power to raise their profile instead. It's also huge for retention: in budget cut season, newsroom managers often fall back on the old saw that "we're not here for the money." But we would do well to remember that it cuts both ways: if someone isn't working in a newsroom for the money, it needs to be rewarding in other ways, as publicly as possible.
  • Make every project a chance to learn something new. This one is a personal rule for me, but it's also an important part of running a team. A lot of our best work at the Times started as a general experiment with a new technology or storytelling technique, and was then polished up for another project. And it means your team is constantly growing, creating the opportunity for individuals to discover new niches they can claim for their own.
  • Pair experienced people and newcomers, and treat them both like experts. When any intern or junior developer came onto the Interactives team, their first couple of projects would be done in tandem with me: we'd walk through the project, set up the data together, talk about our approach, and then code it as a team. It meant taking time out of my schedule, but it gave them valuable experience and meant I had a better feel for where their skills were. Ultimately, the team succeeds as a unit, not as individuals.
  • Be intentional and serious about inclusive hiring and coverage. It is perfectly legal to prioritize hiring people from underrepresented backgrounds, and it cannot be a secondary consideration for a struggling paper in a diverse urban area. Your audience really does notice who is doing the writing, and what they're allowed to write about. One thing that I saw at the Times, particularly in the phenomenal work of the video team, was that inclusive coverage would open up new story leads time and time again, as readers learned that they could trust certain reporters to cover them fairly and respectfully.

In retrospect, all of these practices seem common-sense to me — but based on the evidence, they're not. Or perhaps they are, but they're not prioritized: a newspaper in 2018 has tremendous inertia, and is under substantial pressure from inside and out. Transparent management can be difficult — to actively celebrate the people who leave and give away your hard work to the community is even harder. But it's the difference between being the kind of team that grinds people down, or polishes them to a shine. I hope we were the latter.
