This weekend, The Seattle Times released our investigation into Washington's "evil intent" law, which makes it almost impossible to prosecute police officers for the use of deadly force: Shielded by law. This was a great project to work on, and definitely an issue I'm proud we could bring to a wider audience. The source code for it is available here.
One of our interesting experiments in this story was the use of embedded quiz questions, asking people to test their preconceived notions of police shootings. Originally we intended to scatter these throughout the story to grab readers' attention, but a section on the numerical results of the investigation ended up spoiling the answers. Instead, we moved them to a solid block before that section, and it's been well-received. The interactive graphics were actually also a relatively last-minute addition: originally, we were just going to re-run the print graphics, but exposing all the data in a responsive way was just too useful to pass up.
Probably the most technically advanced part of the page is the audio transcript from the 1985 state senate hearing on the law. As the audio plays, the transcript auto-advances and highlights the current line. It also displays a photo of the speaker from the hearing, to help readers get an idea of the players involved. Clicking on the transcript scrubs the audio to the correct spot. We don't do a lot of audio work here, unfortunately, but I think having an interface that's friendly to readers and listeners alike is a really nice touch, and something I do want to take advantage of on future projects. We built it to generate the data from standard subtitle files, so it should be easy to revisit.
Lastly, one of the most important parts of the story is the least flashy: the table in the "by the numbers" section for deadly force rates by race/ethnicity. We had worked for a while with this information presented the same way as the other trivia questions, via clickable dots, but found that the part we really wanted to stress (the rates of death relative to each group's share of the general population) didn't stand out as much as we wanted. We brainstormed a few alternate visualizations, including stacked bars and nested pie charts, but in the end it was just clearest to build a table.
Like Rodney Dangerfield, tables may get no respect, but a well-designed one can often be the simplest, easiest way to get a point across. The question then is, what's a well-designed table? Personally, I think there's a whole post in that question: how you order the columns, effective sorting/filtering, and how to add extra features (embedded sparklines, detail expansion, and tree views) that add information without confusing readers. One day, maybe I'll write it. But in the meantime, if you're working on a similar project and can't quite figure out how to present your information, there's no shame in using a table if it serves the story.
I'm very proud to say that "Loaded with lead," a Seattle Times investigation into the ways that gun ranges poison their customers and workers, went live this weekend. I worked on all four interactives for this project, as well as doing the header design and various special effects. We'll have a post up soon on the developer blog about those headers, but what I'd like to talk about today is one particular graphic — specifically, the string-of-pearls chart from part 2.
The data underlying the pearl chart is a set of almost 300 blood tests. These are not all tests taken by range workers in Washington, just the ones that had to be reported after exceeding the safe threshold of 10 micrograms per deciliter. Although we know who some of the tested workers are, most of them are identified only by an anonymous patient ID and the name of their employer. My first impulse was to simply toss the data into a scatter chart, but as is often the case, that first impulse proved ill-advised.
Talking with reporters, what emerged was that the time dimension was not really important to this dataset. What was important was to show that there was a repeated pattern of negligence: that these ranges posted high numbers repeatedly, over long periods of time (in several cases, more than five years). Once we discard a strict time axis, a lot more interesting options open up to us for data visualization.
One way to handle this would be with a traditional box-and-whisker plot, which shows the median and variation within a statistical set. Unfortunately, box plots are also wonky and weird-looking for most readers, who are not statisticians and would not know a quartile if it offered them a grilled cheese sandwich. So one prototype stripped the box plot down to its barest form, probably too bare: for each gun range, I rendered a bar that ran from the lowest to the highest test result, with individual test results marked by lines inside that bar.
This version of the plot was visually interesting, but it had flaws. It made it easy to see the general level of blood tests found at each range, and compare gun ranges against each other, but it didn't show concentration. Since a single tick mark was shown within the bar no matter how many test results fell at a given level, there was little visual difference between two employers with the same range of test results, even if one employer's results sat mainly at the top of the range and the other's were clustered at the bottom. We needed a way to show not only level, but also distribution, of results.
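To make that flaw concrete, here is a small Python sketch of what the bar-and-ticks prototype effectively kept from the data. The function name `range_bar` and both lists of values are invented for illustration; they are not results from the investigation.

```python
def range_bar(results):
    """Summarize results the way the bar prototype drew them:
    a bar from lowest to highest, plus one tick per distinct level."""
    return {
        "low": min(results),
        "high": max(results),
        "ticks": sorted(set(results)),  # duplicates collapse into one tick
    }


# Invented values in micrograms per deciliter, for illustration only.
mostly_high = [12, 30, 30, 30, 30, 31]
mostly_low = [12, 12, 12, 12, 30, 31]

# Both employers produce exactly the same drawing instructions.
same_picture = range_bar(mostly_high) == range_bar(mostly_low)
```

Because repeated values collapse into a single tick, two very different sets of results end up as the same picture.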
Given that the chart was already basically a number line, with a bar drawn from the lowest to the highest test result, I removed the bar and replaced the tick marks with circles that were sized to match the number of test results at each amount. Essentially, this is a histogram, but I liked the way that the circles overlapped to create "blobs" around areas of common test results. You can immediately see where most of the tests fall for each employer, but you don't lose sight of the overall picture (which in some cases, like the contractors working outside of a ventilation hood at Wade's, can be horrific — almost three times the amount considered dangerous by the CDC). I'm not aware of anyone else who's done this kind of chart before, but it seems too simple for me to be the first to think of it.
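The sizing step can be sketched in a few lines of Python. This is only one plausible approach, with some loud assumptions: results are grouped by exact value, each circle's area (not its radius) grows with the count, and the name `pearls` and the `base_radius` parameter are hypothetical. The chart itself may have scaled its circles differently.

```python
from collections import Counter
from math import sqrt


def pearls(results, base_radius=4.0):
    """Return one circle per distinct test level, sized by how often it occurs.

    Assumption: area is proportional to count, so radius scales with sqrt(count).
    """
    counts = Counter(results)
    return [
        {"level": level, "count": n, "radius": base_radius * sqrt(n)}
        for level, n in sorted(counts.items())
    ]
```

Each entry gives a position on the number line (`level`) and a `radius`, so a level with four results gets a circle with twice the radius of a level with one. Neighboring circles overlap wherever results bunch together.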
I'd like to take a moment here to observe that pretty much all data visualization comes down to translating information into a form that our visual systems have evolved to understand quickly. There's a great post on how that translation functions here, with illustrations that show where each arrangement sits on a spectrum of perceived accuracy and meaning. It's not rocket science, but I think it's a helpful perspective: I'm just trying to trick your visual cortex into absorbing a table's worth of data at a glance.
But what I've been trying to stress in the newsroom from this example is less technical, and more about how much effective digital journalism comes from the simple process of iteration and self-evaluation. We shouldn't expect to come up with a brilliant interactive on the first try every time, or even any of the time. I think the string-of-pearls is a great example of that, going from a visualization that was confusing and overly broad to a more focused graphic statement, thanks to a lot of evolution and brainstorming. It was exhausting work, but it's become my favorite of the four visualizations for this project, and I'm looking forward to tweaking it for future stories.