It has been a busy week, but I wanted to take a moment to recognize my colleagues at The Seattle Times for their tremendous work, resulting in a 2015 Pulitzer Prize for breaking news journalism. Their coverage of the Oso landslide was clear, comprehensive, and accurate, and followup work continues to this day (including one of my first projects for the paper). It's very cool to be working in a newsroom that's the winner of 10 Pulitzer Prizes over the years, and I'm looking forward to being here when we win #11.
We have sent several people from the newsroom, including myself, to journalism conferences over the last few months. Most conferences are about 50% inspirational and 50% crap (tilted heavily crap-wards in the keynotes), but you meet good people and you get to see the nuts and bolts behind the scenes of some of the best interactive news stories published.
It's natural to come back from a conference with a kind of inferiority complex, and equally easy to conclude that we're not making similar rich presentations because we don't have the cool tools that those other (richer, more tech-savvy) newsrooms have. We too, according to this train of thought, need to be coding elaborate visualization generators and complicated new CMS features — or, as Ryan Pitts from Mozilla said to me last weekend at the Society for News Design workshop, "let's not rest until every paper in the country has built its own charting application."
I think better newsroom tech is important, but let's play devil's advocate for a bit with an unpopular hypothesis: developing tools for the editors and reporters at your newspaper is a waste of your time, and a distraction from the journalism you should be doing.
Why a waste of your time? Partly because newsroom tools get a lot less uptake than you probably think they do (certainly less than we'd hope they would). I've written a lot of internal applications in my time, and they've never been particularly popular, because most reporters and editors don't care. They're too busy doing journalism to use your solution (which is as it should be), and they are probably not big on technology anyway (I have a lot of reporters who can't use Excel, which pains me greatly). Creating tools for reporters is, most of the time, attacking the problem at the wrong point.
For many newsrooms, that wasted time will end up being twice as expensive, because development resources are scarce and UI is hard. Building a polished, feature-filled chart generator that the average journalist can use will take at least a couple of programmer months, which is time those developers aren't working on stories and visualizations that readers want. Are you willing to sacrifice that time, especially if you can't guarantee that it'll actually get used? That's a pretty big gamble, unless you have the resources of the New York Times. You're probably better off just going with an off-the-shelf package, or even finding a simpler solution.
I don't think it's a coincidence that, for all the noise people make about the new data journalism startups like Vox and FiveThirtyEight, 99% of their chart output does not come from a fancy tool or a complex interactive: they post JPEGs. And that's fine! No actual reader has ever complained about having to look at a picture of a graph instead of a souped-up vector rendering (in Vox's case, they're too busy complaining that the graph was stolen from someone else, but that's another story). JPEG is a perfectly decent solution when it comes to simply telling the story across the entire web platform — in fact, it's a great embodiment of "do the simplest thing that works," which has served me well as a guiding motto in life.
So, as a rule of thumb: don't build charting libraries. Don't build general-purpose databases. Don't build drag-and-drop slideshows. Leave these things to other people, who have time and energy to build them for a living. Does this mean you shouldn't create tools at all? No, but the target audience should be you, the news developer, and other semi-technical newsroom staff like the web producers. In other words, make technology for the people who will actually use it, and can handle something that's not polished to a mirror sheen.
I believe this is the big strength of web components, and one reason I'm so bullish on them at the Seattle Times. They're not glossy, end-user products, but they are a great balance between power and accessibility for people with a little technical skill, and they're very fast to build. If the day comes when we do choose to invest in a slicker newsroom app, we can leverage them anyway, the same way that the NYT's fancy chart designers are all based on the developer-oriented D3 library.
In the meanwhile, while I would consider an anti-tool stance a "strong opinion weakly held," I think there's a workable philosophy there. These days, I feel two concerns very strongly (outside of my normal news/editorial production, of course): how to get the newsroom to make use of our skills, and how to best use the limited developer resources we have. A "no tools" guideline is not an absolute rule, but it serves as a useful heuristic to weed out the kinds of projects that might otherwise take over our time.
The honest answer is "about five years of practice," but that's not the whole story (nor is it something I can take to a curriculum planning committee). I think there are two areas of growth that students need to be aware of, and that I'm planning on stressing for this quarter: tooling and functional programming.
Learning their way around this trio is going to be a huge challenge for my students, most of whom still live in a world where individual files are edited and sent to the browser as-is (possibly with a PHP include or two). They haven't built applications with RESTful routes, or written client-side code in a module system. SCC hasn't typically stressed those techniques, which is a shame.
I'm happy to be the person who forces students into the deep end, but I do want to make sure they have a good, structured experience. Throwing everything at students is a quick way to make sure that they get overwhelmed and give up (not a hypothetical scenario: the previous ITC 298 class had exactly that problem, and ended poorly). To ease them in, we'll try building the following sequence of exercises in our directed lab sessions:
The progression starts with Node, and then builds out gradually so that each step conceptually depends on a previous lesson. Along the way, students will learn a lot about how to structure an application across all three of these environments — which brings us to the second, and probably harder, focus of the class.
The hard part of writing for Node is that you must embrace some degree of functional programming: the continuation-passing style used in the core APIs makes it inescapable. But the great part of writing for Node (especially as the first section of the course) is that it's actually a fairly gentle ramp-up. Callback functions are not that far from event listeners, and the ubiquitous async library softens the difficulty of mapping an array functionally. Between the two, there's no shortage of practice, since there's literally no other way to write a Node program.
That's the other strategy behind the tooling sequence I've laid out. We'll start from Node, and then build toward increasingly complex functional constructs, like modules, constructors, and promises. By the time the class have finished their final projects, they should be old hands at callbacks and closures, which will serve them well in almost any language.
The specifics of this quarter are still a little bit in flux, and will likely remain so, since I think it's good to be flexible the first time teaching a class. But if you're interested in following along, feel free to check out the class repo, which contains the syllabus, supporting materials, and example code so far. Issues and pull requests are also welcome!
In the last couple of weeks, a few more of my Seattle Times projects have gone live — namely, the animated graph in this story about EB-5 visa growth, and the Seattle architecture quiz. Both use the FLIP animation technique I wrote about a few weeks ago, although it's much more elaborate in the EB-5 graph, which animates roughly 150 elements at 60fps on older mobile devices.
I saw a lot of shocked reactions when Nintendo announced it would be partnering with another company to make smartphone games. The company was quick to stress that it wouldn't be moving entirely to app stores controlled by third parties: these games will not be re-releases of existing titles, and Nintendo is still working on new dedicated console hardware for the next generation. You shouldn't expect New Super Mario on your phone anytime soon. Basically, their smartphone games will serve as ads for the "real" games.
Unlike a lot of people, I've never really rooted for Nintendo to become a software-only company. Other companies that make that jump often do so to their detriment — look at Sega, which lost a real creative spark when they got out of the hardware business — and it's even more true for Nintendo, which has always explored the physical aspects of gaming as much as the virtual. The playful design of the GameCube controller buttons, or the weirdness of a double-screened handheld, or the runaway popularity of Wii Sports, are the result of designers who are encouraged to hold strong opinions. A touchscreen, on the other hand, is a weak opinion — even no opinion, as it imitates (but never really emulates) physical controls like buttons or joysticks.
But here's the other thing: what Nintendo represents on dedicated handheld hardware, as much as wacky design chops, is a sustainable market. I play a lot of Android games, I own a Shield, I'm generally positive on the idea of microconsoles. Even given those facts, a lot of the games I play on the go are either emulators or console ports, because the app store model simply does not support development beyond a single mechanic or a few hours of gameplay. The race to the bottom, and the resulting crash of mobile game prices, means that you will almost never see a phone game with the kind of lifespan and complexity you'd get out of even the lamest Nintendo title (Yoshi Touch & Go aside).
I don't think everything Nintendo produces is golden, but they're reliable. People buy Nintendo games because you're pretty much guaranteed a polished, enjoyable experience, to the point where they can start with an expanded riff on a gimmick level and still end up with a solid gameplay hit. They're the Pixar of games. And as a result of that consistency, people will pay $40 for first-party Nintendo titles, largely sight-unseen. This creates a virtuous cycle: the revenue from a relatively-expensive gaming market lets them make the kind of games that justify that cost. It's almost impossible to imagine Nintendo being able to sustain the same halo in a $1-5 game market.
There's room for both experiences in the gaming ecosystem. Microsoft, Sony, and Steam will all provide big-budget, adult-oriented games. The app stores are overflowing with shorter, quirkier, free-to-play fare. Nintendo's niche is that they crossed those lines: oddball software for all ages that was polished to a mirror sheen. Luckily, even though observers seem convinced that Nintendo is doomed, the company itself seems well aware of where their value lies — and it's not on someone else's platform.
About nine months ago, I made the first check-in on the Seattle Times news app template. Since that time, it's been at the heart of pretty much everything we've done at the Times, ranging from big investigative projects to Super Bowl coverage to dog name analysis. We've adapted it to form the basis of our web component stack, and made a version that automates Leaflet map creation. It's been a pretty great tool, used by news apps developers, producers, and graphics team members alike.
That said, I think in digital journalism we often talk in glowing terms about our tools, but we don't nearly as often discuss the downsides they possess. So let's be honest with ourselves: I love this scaffolding, but it's not perfect. It has issues. And I think those issues say interesting things about not only the template itself, but also newsroom culture, and the challenges of creating tools that can operate there.
What are the common threads here? While you could point to the static page approach as being part of the issue, I actually think what causes a lot of these problems is that the intended audience for the news app template is both broad and narrow. It's broad in that its users range from novice journalists to experienced developers (and, indirectly, non-technical editors and reporters feeding data into Google Sheets). It's narrow in that the actual production still requires a high level of technical comfort: familiarity with the command line, new kinds of tooling, and some ability to roll with unexpected bugs.
This is a tough, and self-contradictory, audience for a visualization toolkit. It's not, however, out of character for a general-purpose dev framework. And indeed, when we talk about app scaffolds from any news organization (not just The Seattle Times), that's what they are. They're written to be fast, to be portable, and to generate static files, because those are our priorities as deadline-driven journalists. They are also the far end of a range of newsroom tools, where news apps are at one end and pre-built widgets live on the other. I'm not really worried about where the template lives on that range, and I'm certainly not planning on reducing the complexity — I think it's at a sweet spot right now. But I do worry about the ways that it (and our CMS) fit into newsroom culture.
At the Times, like in many newsrooms, the online presence is largely run by "producers," who curate the stories on the home page and handle the print-to-digital transition process (it's not the same as a "producer" in software development). This process is complicated and highly-skilled, because news CMS systems are generally terrible. The web production staff also often work on projects that would, in print, fall under page design: building complex HTML presentations for special stories. This isn't because they're trained designers: producers are often younger, and while it's not entry-level work, it's close. They end up doing this work because trained, HTML-fluent designers are rare, and because nobody else in the newsroom bothers to learn web design.
As a result, we end up in a funny situation: the only people in the newsroom who really understand the web are the producers. Editors and reporters are discouraged from becoming more technically savvy because the workflow is print-first, and the CMS is so intimidating. Meanwhile, producers rarely become editors or reporters because the newsroom can't afford to lose their skills. There's a tremendous gap in newsroom culture between people who produce the content, and people who actually understand the medium in which that content is consumed. While the tooling is not entirely responsible for that, it is a contributing factor.
I think the challenge we face, as newsroom developers, is to be always aware and vigilant of that gap and its causes. Tools like the news app template are important, because they speed up our work, and the work of other technical people. But they don't mitigate the need for better, web-first publishing systems — something that can help diffuse web thinking from a producer-only skill to something that's available throughout the newsroom.
It's an accepted truth on the web that fast pages are better for users — people stay on them longer, follow more links from them, and generally report being happier with them. I think a lot about performance on my projects, because I want readers to be thinking about the story, not distracted by slow load times.
It's possible that I've been more aware of it, just because I've been working on a project that involves smoothly animating a chart using regular HTML instead of canvas, but it seems like it's been a bad month for that kind of thing. First Peter-Paul Koch wrote a diatribe about client-side templating, insisting that it's a needless performance hit. Then Flipboard wrote about discarding traditional elements entirely, instead rendering everything to a canvas tag in pursuit of 60 frames/second animations. Ironically, you'll notice that these are radically different approaches that both claim they create a better experience.
Instead of just sighing while the usual native app advocates use these posts to bash the web, and given that I am working on a page where high-performance mobile animations are a key part, I thought it'd be nice to talk about some experiments I've run with the approaches found in both. There are a lot of places where the web platform needs help competing on mobile, no doubt. But I'd prefer we talk about actual performance problems, and not get sidetracked into chasing down scattered criticisms without evidence.
At the other extreme is Flipboard's experiment with canvas rendering. Instead of putting everything in the document, like normal websites, they put a full-screen canvas image up and render everything — text, images, animations, etc. — manually to that buffer. You can try a demo out on your device here. On my Nexus 5, which is a reasonably new device running the latest version of Chrome, it's noticeably choppy. My experience with canvas is that Chrome's implementation is actually much faster than Safari, so I don't expect it to be smooth on iOS either (they've blacklisted tablets, so I can't be sure).
In order to get this "fluid" experience, here's what the Flipboard team threw away:
Again, I'm not claiming that my use case is a perfect analogue. I'm animating a graphic in response to a single button press, and they're attempting to create an "infinite scroll" (sort of — it's not really a scroll so much as an animated pager). But this idea that "the DOM is lava" and touching it will cause your reader's phones to instantly burst into flames of scorn seems patently ridiculous, especially when we look back at that list of everything that was sacrificed in the single-minded pursuit of speed.
Performance is important, and I care deeply and obsessively about it. As a gamer and a graphics nerd, I love tweaking out those last few frames per second, or adding flashy effects to a page. But it's not the most important thing. It's not more important than making your content available to the blind or visually impaired. It's not more important than providing standard UI actions like copy-and-paste or "open in new tab." And it's not more important than providing a fallback for older and less-powerful devices, the kind that are used by poor readers. Let's keep speed in perspective on the web, and not get so caught up in dogma that we abandon useful techniques like client-side templating and the DOM.
This week, if you want to be horrified by our grim meathook future, check out these posts from Seattle Times news librarian Gene Balk on vaccination rates at Washington State schools. There's a searchable data table and a map, but I'll spoil it for you: a large proportion of parents should probably pack surgical masks and antibiotics with their kids' lunches, because herd immunity is basically a thing of the past.
This kind of database-driven reporting is a staple of Gene's "FYI Guy" blog, and readers seem to enjoy it. Done right, it can help flesh out local coverage in interesting ways, explore topics that are off the beaten path, and find connections that we might otherwise miss. That said, I don't think you can stress enough how much of that depends on the quality of the reporter: Gene is a great researcher, and not everyone has his skills and experience.
By coincidence, yesterday Melissa Bell at Vox announced that they're (re)entering the field of data journalism in a almost parodically-titled post. I'm a little confused about the timing, since I thought data journalism was a part of their whole raison d'etre, but maybe I'm confusing them with a different scrappy, SEO-oriented news startup. Regardless, welcome to the party! After name-checking Philip Meyer's Precision Journalism, Bell adds a list of nine basic guidelines they plan to use. It's not a bad list, although several items are inoffensively bland (has anyone ever aspired to produce content that isn't "relevant and useful?").
- Vox will work to provide the most relevant and useful data behind the news, when you need it, in ways that help you understand the stories that matter most.
- We will work to make all the data behind our stories available to you to download and play with for yourself.
- We want you to improve on what we’ve done, to play with the data, visualize it, and help us analyze it — and make our work better.
- We will prioritize building data sets that can feed many stories, rather than focusing on one-off projects.
- Our data visualizations will be clear, concise, and deep — to help you understand our editorial better. They will adhere to design rules which ensure their accuracy and transparency.
- In the event we make a mistake (they do happen), we will swiftly and clearly clarify, correct, and communicate that as transparently as we can.
- We will curate and showcase the best data infographics and visualizations on the web.
- Visualizations we produce in-house will work well on as many platforms as possible: if you view it on a smartphone, it will function as well as it does on web.
- We will curate and publish the best content that our community of readers produces. Our data journalism is as much about you, the community, as it is about us: this is a partnership.
Some of these goals are particularly strong, and we share them at the Seattle Times. Take #2, for example: not only do I think it's important that we publish the data on which our visualizations are built whenever possible, but we also open-source our graphics so that people can see the methodology we used. It's also just good sense to be mobile-friendly (#8), although I personally believe that there are some times when a story simply can't be fully told on a 4" screen.
I'm less sure about curation, either from readers (#9) or around the web(#7), particularly in conjunction with accuracy and corrections (#6). One of the strengths of a newsroom is supposed to be fact-checking, but it's not clear to me what the process is for verification of third-party visualizations, or if Vox plans to do so at all (it hasn't been evident to me as a reader that they do it now). Which is too bad, because I think a kind of real-time "Snopes for bad reporting" is a site I'd definitely support.
But I'm really most skeptical of #4, which Bell elsewhere refers to as "finding, cleaning, and setting up data streams so that they can be the source for repeated stories." It's not that I think it's necessarily a stupid idea. I'm just not sure that it's effective, based on my experience. Data stories are just reporting. Data streams are reporting on top of engineering on top of reporting.
CQ's Economy Tracker, for example, was my team's attempt at a reusable data API, but it turned out to be a frustrating experience to keep it topped off with up-to-date content, the architecture was a hard problem to solve, and the number of stories we pulled out of it probably didn't justify the effort. It turns out that it's hard to find a data set that can actually support a series of articles.
(You may say, at this point, hang on a minute: wasn't Congressional Quarterly an example of exactly what we're talking about? It's a large, data-oriented news organization that sold access to data streams, and maintained datasets that were used to build stories and interactives via the multimedia team. Which is true, but it elides a number of factors: CQ was a single-purpose news site — congress and legislation only — with a huge number of reporters feeding the beast and a large technical staff to tend to it. Vox does not have those advantages, since it's a general-audience, international news site with a much smaller staff.)
More importantly, a "data stream," like an API, demands maintenance which quickly becomes a drag on the amount of time that can be spent on efforts outside those streams. That's doubly true if you make them public, and people start relying on them. Will will Vox sunset these data streams, if they stop being useful internally? What are the cutoff criteria? How will they let people know before the source is shut down? Most importantly, how much time will be taken away from reporting to maintain the data products?
When I joined at the Seattle Times, I made a pitch to editors that was a little different: instead of designing long-running services, we generally build news apps that are scoped to a specific point in time. In other words, we make stories, the same as the rest of the newsroom does. And just as you wouldn't normally ask a reporter to go back and update all their old stories when new events happen, we don't maintain news apps more than a week or two after publication (barring, of course, normal corrections and serious bug-fixes). Our entire development stack, in fact, is based on this assumption — that's why we publish static files to S3 (which is cheap and easy), instead of running a Rails/Laravel/Node server (which is expensive and hard).
Maybe for Vox, this isn't a problem. After all, they're the people with the "poor man's Wikipedia" card stacks that they maintain for topics over many months, and the evergreen experiments. At the very least, though, it does highlight a very real distinction that goes (in my opinion) beyond "data journalism" and to the core of the digital news mission. Are we building general systems and tools to cover unique stories? Or are we optimizing for semi-predictable products built around APIs and data sources? I'm leaning toward the former because I think it's a better match for a messy, unpredictable, human world. But best of luck to Vox with the latter.
While we've still got plenty of interesting projects in the works, the Seahawks rampage into the Super Bowl has pretty much taken over the News Apps budget at the Seattle Times this week. As a result, we've got some interesting interactives you might have seen:
More to come, obviously, as the road to the Super Bowl continues! Or, as Marshawn Lynch would say, "Yeah."
Whitman is a simple sampler (womp womp) written for modern web browsers. Built in Angular, it uses the WebAudio API to load and play sound files via a basic groovebox interface. You can try a demo on GitHub Pages. I put Whitman together for my dad's elementary school classes, so it's pretty simple by design, but it was a good learning experience.
The WebAudio API is not the worst new interface I've ever seen in a browser, but it's pretty bad. Some of its problems are just weird: for example, audio nodes are one-shot, and have to be created with new each time that you want to play the sound, which seems like a great way to trigger garbage collection and cause stuttering. Loading audio data is also kind of obnoxious, but at least you only have to do it once. I really wanted to be able to save the audio files in local storage so that they'd persist between refreshes, but getting access to the buffer (at least from the console/debugger) was oddly difficult, and eventually I just gave up.
But parts of it are genuinely cool, too. The API is built around wiring together nodes as if they were synthesizer components — an oscillator might get hooked up to a low-pass filter, then sent through a gain node before being mixed into the audio context — which feels pleasantly flexible. I'd like to put together a chiptune tracker with it. Support is decent, too, with the mobile browsers I care about (Safari and Chrome) already having decent availability. IE support is on the way.
The most surprising thing about Whitman is that it ended up being entirely built on web tech. When I started the project, I expected to move it over to a Chrome App at some point (it'll be taught on Chromebooks). There are still some places where that would have been nice (file retention, better support for saving data and synchronization), but for the most part it wasn't necessary at all. Believe it or not, you can pretty much write a basic audio app completely on the web these days, which is amazing.
In the parts where there is friction, it feels like a strong argument in favor of the Extensible Web. Take saving files, for example: without a "File Writer" object, Whitman does it by creating a link with a download attribute, base64-encoding the file into the href, and then programmatically clicking it when the user goes to save. That's a pretty crappy solution, because browsers only expose data URIs to create files. We need something lower-level, that can ask for permission to write locally, outside of a sandbox (especially now that the File System API is dead in the water).