We posted Quantity of Care, our investigation into operations at Seattle's Swedish-Cherry Hill hospital, on Friday. Unfortunately, I was sick most of the weekend, so I didn't get a chance to mention it before now.
I did the development for this piece, and also helped the reporters filter down the statewide data early on in their analysis. It's a pretty typical piece design-wise, although you'll notice that it re-uses the watercolor effect from the Elwha story I did a while back. I wanted to come up with something fresh for this, but the combination with Talia's artwork was just too fitting to resist (watch for when we go from adding color to removing it).
In the end, I'm just happy to have worked on something that feels like such an important, powerful investigative story.
2016 was a busy year for interactive projects at The Seattle Times. According to our (very informal) spreadsheet, we did about 72 projects this year, about half of which were standalone. That number surprised me: at the start of the year, it felt like we were off to a slow start, but the final total isn't markedly lower than 2015, and some of those pieces were ambitious.
The big surprise of the year was Under Our Skin, a video project started by four young women in the newsroom and done almost completely under the radar. The videos themselves examine a dozen charged terms, particularly in a Seattle context, and the team made a lot of smart little choices along the way, such as the clever commenting prompts and the decision not to identify the respondents inline (so as not to invite pre-judgement). The editing is also fantastic. I pitched in a little on the video code and page development, and I've been working on the standalone versions for classroom use.
Perhaps the most fun projects to work on this year were with reporter Lynda Mapes, who covers environmental issues and tribal affairs for the Times. The Elwha: Roaring back to life report was a follow-up on an earlier, prize-winning look at one of the world's biggest dam removal projects, and I wrote up a brief how-to on its distinctive watercolor effects and animations. Lynda and I also teamed up to do a story on controversial emergency construction for the Wolverine fire, which involved digging through 60GB of governmental geospatial data and then figuring out how to present it to the reader in a clear, accessible fashion. I ended up re-using that approach, pairing it with SVG illustrations, for our ST3 guide.
SVG was a big emphasis for this year, actually. We re-used print assets to create a fleet of Boeing planes for our 100-year retrospective, output a network graph from Gephi to create a map of women in Seattle's art scene, and built a little hex map for a story on DEA funding for marijuana eradication. I also ended up using it to create year-end page banners that "drew" themselves, using Jake Archibald's animation technique. We also released three minimalist libraries for working with SVG: Savage Camera, Savage Image, and Savage Query. They're probably not anything special, but they work around the sharp edges of the elements with a minimal code footprint.
Finally, like much of the rest of the newsroom, our team got smaller this year. My colleague Audrey is headed to the New York Times to be a graphics editor. It's a tremendous next step, and we're very proud of her. But it will leave us trying to figure out how to do the same quality of digital work when we're down one newsroom developer. The first person to say that we just need to "do more with less" gets shipped to a non-existent foreign bureau.
On election day, voters in the Seattle metro area will need to choose whether or not to approve Sound Transit 3, a $54 billion funding measure that would add miles of light rail and rapid bus transit to the city over the next 20 years. It's a big plan, and our metro editor at the paper wanted to give people a better understanding of what they'd be voting on. So I worked with Mike Lindblom, the Times' transportation reporter, and Kelly Shea in our graphics department to create this interactive guide (source code).
The centerpiece of the guide is the system map, which picks out 12 of the projects that will be funded by ST3. As you scroll through the piece, each project is highlighted, and it zooms to fill the viewport. These kinds of scrolling graphics have become more common in journalism, in part because of the limitations of phone screens. When we can't prompt users to click something with hover state and have limited visual real estate, it's useful to take advantage of the most natural verb they'll have at their fingertips: scroll. I'm not wild about this UI trend, but nobody has come up with a better method yet, and it's relatively easy to implement.
The key to making this map work is the use of SVG (scalable vector graphics). SVG is the unloved stepchild of browser images — badly-optimized and only widely supported in the last few years — but it has two important advantages. First, it's an export option in Illustrator, which means that our graphics team (who do most of their work in Illustrator) can generate print and interactive assets at the same time. It's much easier to teach them how to correctly add the metadata I need in a familiar tool than it is to teach the artists an entirely new workflow, especially on a small staff (plus, we can use existing assets in a new way, like this Boeing retrospective that repurposed an old print spread).
Unfortunately, while these are great points in SVG's favor, there's a reason I described it in my Cascadia talk as a "box full of spiders." The APIs for interacting with parts of the SVG hierarchy are old and finicky, and they don't inherit improvements that browsers make to other parts of the page. As a way of getting around those problems, I've written a couple of libraries to make common tasks easier: Savage Query is a jQuery-esque wrapper for finding and restyling elements, and Savage Camera makes it easy to zoom and pan around the image in animated sequences. After the election, I'm planning on releasing a third Savage library, based on our <svg-map> elements, for loading these images into the page asynchronously (see my writeup on Source for details).
The camera is what really makes this map work, and it's made possible by a fun property of SVG: the viewBox attribute, which defines the visible coordinate system of a given image. For example, here's an SVG image that draws a rectangle from [10,10] to [90,90] inside of a 100x100 viewbox:
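The image itself isn't reproduced here, so what follows is a minimal reconstruction of the markup being described, built as a Python string that can be written to a file and opened in a browser. The function name `rect_svg`, the fill color, and the 200-pixel display size are assumptions for illustration.

```python
def rect_svg(view_box="0 0 100 100", size=200):
    """Return SVG markup for a rectangle spanning [10,10] to [90,90]."""
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{size}" height="{size}" viewBox="{view_box}">\n'
        # x/y give the top-left corner; width/height reach the [90,90] corner.
        f'  <rect x="10" y="10" width="80" height="80" fill="steelblue" />\n'
        f'</svg>'
    )

print(rect_svg())
```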
If we want to zoom in or out, we don't need to change the position of every shape in the image. Instead, we can just set the viewbox to contain a different set of coordinates. Here's that same image, but now the visible area goes from [0,0] to [500,500]:
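As a rough sketch of that idea, assuming some hypothetical markup for the rectangle: only the `viewBox` attribute changes between the two versions, and the shape's own coordinates stay put. `with_view_box` is an illustrative helper, not part of any library.

```python
import re

SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200" '
    'viewBox="0 0 100 100">\n'
    '  <rect x="10" y="10" width="80" height="80" />\n'
    '</svg>'
)

def with_view_box(svg, x, y, width, height):
    """Return the markup with its viewBox swapped for the given region."""
    replacement = f'viewBox="{x} {y} {width} {height}"'
    return re.sub(r'viewBox="[^"]*"', replacement, svg, count=1)

# Same rectangle, but the visible area now runs from [0,0] to [500,500].
zoomed_out = with_view_box(SVG, 0, 0, 500, 500)
print(zoomed_out)
```

With the larger viewBox, the same 80-unit rectangle covers only about a sixth of the visible width, so it appears small and tucked into the top-left corner.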
Savage Camera was written to make it easy to manipulate the viewbox in terms of shapes, not just raw coordinates. In the case of the ST3 guide, each project description shares an ID with a group of shapes. When the description scrolls into the window, I tell the camera to focus on that group, and it handles the animation. SVG isn't very well-optimized, so on mobile this zoom is choppier than I'd like. But it's still way easier than trying to write my own rendering engine for canvas, or using a slippy map library like Leaflet (which can only zoom to pre-determined levels).
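To make the "focus on a group" idea concrete, here is a small sketch of the arithmetic involved. This is not Savage Camera's actual API: the `Box`, `focus`, and `tween` names are hypothetical, and the linear interpolation is an assumption standing in for whatever easing a real animation would use.

```python
from typing import NamedTuple

class Box(NamedTuple):
    x: float
    y: float
    width: float
    height: float

def focus(target: Box, aspect: float, padding: float = 0.1) -> Box:
    """Return a viewBox containing `target`, padded, with the given width/height ratio."""
    width = target.width * (1 + 2 * padding)
    height = target.height * (1 + 2 * padding)
    # Grow whichever dimension is too small so the box matches the viewport's shape.
    if width / height < aspect:
        width = height * aspect
    else:
        height = width / aspect
    # Keep the target centered inside the expanded box.
    cx = target.x + target.width / 2
    cy = target.y + target.height / 2
    return Box(cx - width / 2, cy - height / 2, width, height)

def tween(start: Box, end: Box, t: float) -> Box:
    """Linearly interpolate between two viewBoxes, for t between 0 and 1."""
    return Box(*(a + (b - a) * t for a, b in zip(start, end)))

full = Box(0, 0, 500, 500)
target = focus(Box(100, 200, 50, 100), aspect=1.0, padding=0.0)
# Five evenly spaced frames from the full view to the focused view.
frames = [tween(full, target, step / 4) for step in range(5)]
```

In a browser, each frame would be written back to the `viewBox` attribute on successive animation ticks; the shapes themselves never move.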
This is not the first time that I've built something on this functionality: we've also used it for our Paper Hawks and for visualizing connections between Seattle's impressive women in the arts. But this is the most ambitious use so far, and a great chance to practice working closely with an illustrator as the print graphic was revised and updated. In the future, if we can polish this workflow, I think there's a lot of potential for us to do much more interesting cross-media illustrations.
It's been a busy few weeks, but I do at least have an article up on Source with an overview of my SRCCON session on creating more humane digital journalism.
This month, I'm teaching a class at the University of Washington on reporting with Python. This seems like an odd match for me, since I hardly ever work with Python, but I wanted to do a class that was more journalism-focused (as opposed to the front-end development that I normally teach) and teaching first-time programmers how to do data analysis in Node just isn't realistic. If you're interested in following along, the repository with the class materials is located here.
I'm not the Times' data reporter, so I don't get to do this kind of analysis often, but I always really enjoy it when I do. The danger when planning a class on a fun topic is that it's easy to over-stuff the curriculum in my eagerness to cover the techniques that I think are particularly interesting. To fight that impulse, I typically make a list of material I want to cover, then cut it in half, then think about cutting it in half again. As a result, there's a lot of stuff that didn't make it in, with SQL and web scraping chief among them.
What's left, however, is a pretty solid base for reporters who are interested in starting to use code to generate and explore stories. Last week, we cleaned and searched 1,000 text files for a string, and this week we'll look at doing analysis on CSV files. In the final session, I'm planning on taking a deep dive into regular expressions: so much of reporting is based around interrogating text files, and the nice thing about an education in regex is that it will travel into almost any programming language (as well as being useful for many command line tools like grep or sed).
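For a sense of what those exercises look like, here is a minimal sketch (not the actual class materials). The function names, the `.txt` extension, and the date pattern are all assumptions chosen for illustration.

```python
import re
from pathlib import Path

def files_containing(folder, needle):
    """Yield the paths of .txt files under `folder` whose text includes `needle`."""
    for path in sorted(Path(folder).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        # Compare in lowercase so the search ignores capitalization.
        if needle.lower() in text.lower():
            yield path

# An illustrative pattern for U.S.-style dates such as 3/14/2016.
DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

def find_dates(text):
    """Return a (month, day, year) tuple of strings for each date-like match."""
    return DATE.findall(text)
```

The same pattern, with small syntax differences, would also work in `grep -E` or in JavaScript, which is the portability argument for learning regex at all.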
If I can get anything across in this class, I'm hoping to leave students with an understanding of just how big digital scale can be, and how important it is to have tools for handling it. I was talking one night with one of the Girl Develop It organizers, who works for a local analytics company. Whereas millions of rows of data is a pretty big deal for me, for her it's a couple of hours on a Saturday — she's working at a whole other order of magnitude. I wouldn't even know where to start.
Right now, most record requests and data dumps operate more at my scale. A list of all animal imports/exports in the US for the last ten years is about 7 million records, for example. That's approachable with Python, although you'd be better off learning some SQL for the heavy lifting, but it's past the point where Excel is useful, and it certainly couldn't be explored by hand. If you can't code, or you don't have access to someone who does, you can't write that story.
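As a hedged illustration of why that scale is approachable in Python: reading a CSV one row at a time keeps memory use flat no matter how many records there are. The function, filename, and column name below are hypothetical, not taken from the actual dataset.

```python
import csv
from collections import Counter

def count_by_column(lines, column):
    """Tally the values in one column, reading a row at a time instead of loading everything."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column].strip()] += 1
    return counts

# Usage with a real file (the filename and column are hypothetical):
# with open("imports.csv", newline="", encoding="utf-8") as f:
#     print(count_by_column(f, "species").most_common(10))
```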
At some point, the leaks and government records that reporters pore over may grow to a larger kind of scale (leaks, certainly; government data will be aggregated as long as there are privacy concerns). When that happens, reporters will have to develop the kinds of skills that I don't have. We already see hints of this in the tremendous tooling and coordination required for investigating the Panama Papers. But in the meantime, I think it's tremendously important that students learn how to automate data at a basic level, and I'm really excited that this class will introduce them to it.
Judging by my peers, it's possible that I'm the only journalist in America who didn't absolutely love Spotlight. I thought it was a serviceable movie, but when it comes to this year's Best Picture award I still harbor a fantasy that there's an Oscar waiting in Valhalla, shiny and chrome, for Fury Road (or for Creed, if push came to shove).
But I'm not upset to see Spotlight win, either. The movie may have been underwhelming for me, but its subject deserves all the attention it gets (whether or not, as former NYT designer Khoi Vinh wonders, the Globe fully capitalizes on it). My only real concern is that soon it'll be mostly valuable as a historical document, with the kind of deep reporting that it portrays either dying or dead.
To recap: Spotlight centers on the Boston Globe's investigation into the Catholic Church's pedophilia scandals in the 1990s — and specifically, into how the church covered up for abusive priests by moving them around or assigning them to useless "rehabilitation" sessions. The paper not only proved the fact that the church was aware of the problem, but also demonstrated that it was far more common than anyone suspected. It's one of the most important, influential works of journalism in modern memory, done by a local newsroom.
It's also a story of successful data journalism, which I feel is often rare: while my industry niche likes to talk itself up, our track record is shorter than many of us like to admit. The data in question isn't complex — the team used spreadsheets and data entry, not scripting languages or visualizations — but it represents long hours of carefully entering, cleaning, and checking data to discover priests that were shuffled out of public view after reports of abuse. Matt Carroll, the team's "data geek," writes about that experience here, including notes on what he'd do differently now.
So it's very cool to see the film getting acclaim. At the same time, it's a love letter to an increasingly small part of the news industry. Investigative teams are rare these days, and many local papers don't have them anymore. We're lucky that we still have them at the Seattle Times — it's one of the things I really like about working there.
Why do investigative teams vanish? They're expensive, for one thing: a team may spend months, or even a year working on a story. They may need legal help to pursue evidence, or legal protection once a story is published. And investigative stories are not huge traffic winners, certainly not proportional to the effort they take. They're one of the things newsrooms do on principle, and when budget gets tight, those principles often start to look more negotiable than they used to.
In this void, there are still a few national publishers pursuing investigations, both among the startups (BuzzFeed, which partnered on our mobile home stories) and the non-profits (ProPublica and the Marshall Project). I'm a big fan of the work they're doing. Still, they're spread thin trying to cover the whole country, or a particular topic, leaving a lot of shadows at the local level that could use a little sun.
It's nice to imagine that the success of Spotlight the movie will lead to a resurgence in funding for Spotlight the investigative department, and others like them. I suspect that's wishful thinking, though. In the end, that Oscar isn't going to pay for more reporters or editors. If even Hollywood glamor can't get reporters and editors funded, can anything?
The last thing I'd written about here was the paper's investigation into police shootings, so let's take this chance to wander through the rest of 2015.
In October, after a Seattle dentist shot Cecil the lion and made himself temporarily infamous, one of our reporters put in a records request for all historical animal imports into the USA. The resulting story involved querying through seven and a half million rows of data to find out what we import, and how Paul Allen's Initiative 1401 (which banned the resale of several species of animal trophies) would affect these imports (answer: hardly at all). We also got to do some fun visualizations for it.
In November, my teammate Audrey worked with the Seattle Sketcher to create a voiced history of Ravensdale, a boomtown destroyed after a mining accident. In general, audio slideshows aren't hugely successful online, but I think this one was a really pleasant experience, and analytics indicate that a lot of people listened to it.
Every year, during the Seahawks season, the paper does a series of "paper hawks": foldable paper dolls for players on the team. The last one is blank, so people can put in their own faces. To make things interesting, I put together a paper hawk web app that could use a camera to take a picture of the reader, and do all the customization in the page (including changing skin tones and hair color), then print it out. This was an interesting project in part because the API I used (getUserMedia) is restricted to HTTPS only in Chrome. To make it work, we moved all of our projects to secure domains, which was a great test case for encrypting additional content at the paper.
For MLK Day, my team revived the Seattle Times' tribute to the great man, which was originally published twenty years ago (and had been last updated in 2011). The new version is responsive and easier to update, so that each year we can add more information to it. It's fitting, of course, that the paper has a page just for Dr. King, since they were a major part of the campaign to rename King County in his honor back in 1995. It's pretty cool to keep that tradition going.
Finally, just this week, we published a Pacific NW Magazine story on modern dating, with an interactive "mini-documentary" that I built with our video team. Based on your answers, it generates a custom playlist from the interviews that we recorded. We were inspired by this great piece done by the Washington Post on "the N word." I really enjoyed putting the interactions and animation together, but honestly, most of the credit goes to our video team, and my work was just the window dressing.
These are just the major interactives, of course. All told, we built 84 projects of all sizes last year, not including various small pages built by the producers using our app template. That's a pretty good rate of production for a two-developer team. Here's to a busy 2016!
How do we level up data journalists? In a few months, we'll have a new digital/data intern at the Times, and so I've been asking myself this question quite a bit, especially in light of our team's efforts to recruit diverse candidates. There are a lot of students and young journalists out there with a little bit of training, but no idea where to go from there: how do we get them across the gap to where they're capable of working on a newsroom development team? There's a catch-22 at work here: it's especially tough for aspiring news devs to get a job without experience, but they can't get experience without the job.
One strategy I've often heard is that young people should attend industry conferences as a way to learn from experienced journalists and build connections. Myself, I'm skeptical of this. Conferences have never really been a part of my professional life. We didn't go to them at CQ, and I never got a chance to go to GDC when I worked in the game industry. After I was hired at the paper, I got to go to SND2015 and Write the Docs, and this year I'm heading to NICAR, SRCCON, and (possibly) CascadiaJS. It's possible I really hate myself.
Visiting conferences is rewarding, but it's also exhausting, expensive, and a huge time-sink. And while host organizations often work to mitigate that through scholarships and grants to disadvantaged communities, it's still a big ask for neophytes. Even if I weren't skeptical of the benefits conferences actually bring, I think it's hard to argue that we don't need better, more accessible solutions.
The way I see it, there are three things that you get out of a conference as a young person:
Of the three, the first is the hardest to duplicate, and yet it's the most crucial. Networks are powerful in this industry, and you can practically watch them develop before your eyes if you look closely: young people who catch a break early with the right people, and find themselves quickly elevated with opportunities to work on well-known teams, fill industry panels, and write insipid Nieman Lab think-pieces on the future of news. Then we all end up competing over hiring those same six people, which I don't really think is healthy.
Ironically, this is something I want to discuss with other newsrooms at the conferences this year, before I retreat into my Seattle cave for the rest of my natural life. But I'm also starting a personal initiative to make myself available for "remote mentorship," and asking other people to do so. If you're in news and would like to join, feel free to add yourself to the sheet, and I'll share it with students or other people who get in touch!
This morning, you can read my opinions (plus those of three other newsroom developers) on AMP, Google's proposed ultra-fast publishing format. I'm the most optimistic of the four, even though I wouldn't say that I'm enthusiastic. I think it's an interesting format, and possibly a kick in the pants for the business side of the industry.
In the last question of the interview, I talk a little bit about how I don't think site performance is a topic of actual discussion for product managers at news organizations, and as a result speed is still not a priority for them. What I didn't get in, but wish I had, is that I'm not sure they're wrong about that. Certainly, performance is important and third party code has run rampant on mobile pages. But is that really what's killing us?
I think it's worth remembering that this whole conversation started, in part, because Facebook decided that they want to be a publisher. Of course, nobody with a firm grasp on reality would think that handing full control of all their content over to Facebook is a good idea, so Zuckerberg's posse needed to create an incentive. Instant Articles ensued: in a burst of publicity, Facebook announced that the web was "slow" (with a lot of highly suspect numbers quantifying that slowness) and proposed their publication system as a way to speed it up.
Since in general we like nothing more than talking about how awful our industry is, journalists leapt to join in: why yes, now that you mention it, look how slow our sites are! Clearly, that's the problem (and not, say, the fact that Facebook holds our referral traffic hostage). It's the same reaction the industry has every time Apple releases a new device — cue exhaustive (and exhausting) ruminations on how to create compelling smartwatch content. Yuck.
This is not the first time that Facebook has created panic around the open web in order to make its social racket seem more appealing. In 2011, Anil Dash wrote his infamous post Facebook is gaslighting the web, documenting their practice of putting scary warnings on outgoing links while privileging their (short-lived) "seamless sharing" program. I think we should be careful about accepting their premises, even when they seem to jibe with the larger conversations around web technology.
Which brings us back to the question: should we care that news sites are slow?
My thought is that from a technical side, we should obviously care. Everyone on the web cares about speed. It has a proven effect on things like purchases and on-site time. It's an important metric, and one we should absolutely take seriously. But from a product standpoint, is it the most important thing? No. It's a Product X, and Product X will not save journalism (that post is from 2010, and sure enough, I think I've linked to it once a year since). It's easier to pitch a silver bullet than to admit the harder truth: that the key to our success is putting out journalism that is good enough that people will pay for it, one way or another.
It's possible, unfortunately, that there is no general-audience journalism good enough to make people pay for it anymore. And in that case, we are all doomed, with the possible exception of the NYT and whatever hipster media startups can get Comcast to cough up $200 million in funding. So it goes. But if we're going to be doomed, I'd rather be honest about why that is. It's not because we're slow. It's not because the ads are horrible. It's because our readers didn't think what we put out was important enough to pay for. That's enough of a tragedy on its own.
This weekend, The Seattle Times released our investigation into Washington's "evil intent" law, which makes it almost impossible to prosecute police officers for the use of deadly force: Shielded by law. This was a great project to work on, and definitely an issue I'm proud we could bring to a wider audience. The source code for it is available here.
One of our interesting experiments in this story was the use of embedded quiz questions, asking people to test their preconceived notions of police shootings. Originally we intended to scatter these throughout the story to grab readers' attention, but a section on the numerical results of the investigation ended up spoiling the answers. Instead, we moved them to a solid block before that section, and it's been well-received. The interactive graphics were actually also a relatively last-minute addition: originally, we were just going to re-run the print graphics, but exposing all the data in a responsive way was just too useful to pass up.
Probably the most technically advanced part of the page is the audio transcript from the 1985 state senate hearing on the law. As the audio plays, the transcript auto-advances and highlights the current line. It also displays a photo of the speaker from the hearing, to help readers get an idea of the players involved. Clicking on the transcript scrubs the audio to the correct spot. We don't do a lot of audio work here, unfortunately, but I think having an interface that's friendly to readers and listeners alike is a really nice touch, and something I do want to take advantage of on future projects. We built it to generate the data from standard subtitle files, so it should be easy to revisit.
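As a sketch of how a transcript like that might be driven from subtitle data: the post doesn't say which subtitle format was used, so this assumes SubRip (.srt), and the function names and sample lines are hypothetical placeholders rather than the project's actual code or testimony.

```python
import re
from bisect import bisect_right

TIMESTAMP = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")

def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    hours, minutes, seconds, millis = map(int, TIMESTAMP.match(stamp.strip()).groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000

def parse_srt(text):
    """Return a list of (start, end, line) cues from SubRip-formatted text."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip malformed blocks
        start, end = (to_seconds(part) for part in lines[1].split("-->"))
        cues.append((start, end, " ".join(lines[2:])))
    return cues

def cue_at(cues, time):
    """Return the index of the cue playing at `time`, or None between cues.

    Assumes the cues are sorted by start time, as they are in a normal SRT file.
    """
    starts = [start for start, _, _ in cues]
    index = bisect_right(starts, time) - 1
    if index >= 0 and time < cues[index][1]:
        return index
    return None

# Placeholder cues for illustration only.
SAMPLE = """1
00:00:00,000 --> 00:00:02,500
First line of testimony.

2
00:00:03,000 --> 00:00:05,000
Second line.
"""

cues = parse_srt(SAMPLE)
```

On each playback tick, `cue_at(cues, current_time)` would pick the line to highlight. Going the other way, clicking line `i` would seek the audio to `cues[i][0]`.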
Lastly, one of the most important parts of the story is the least flashy: the table in the "by the numbers" section for deadly force rates by race/ethnicity. We had worked for a while with this information presented the same as the other trivia questions, via clickable dots, but found that the part we really wanted to stress (the relative rates of death proportional to the general population) didn't stand out as much as we wanted. We brainstormed through a few different alternate visualizations, including stacked bars and nested pie charts, but in the end it was just clearest to build a table.
Like Rodney Dangerfield, tables may get no respect, but a well-designed one can often be the simplest, easiest way to get a point across. The question then is, what's a well-designed table? Personally, I think there's a whole post in that question: how you order the columns, effective sorting/filtering, and how to add extra features (embedded sparklines, detail expansion, and tree views) that add information without confusing readers. One day, maybe I'll write it. But in the meantime, if you're working on a similar project and can't quite figure out how to present your information, there's no shame in using a table if it serves the story.