Mile Zero :: Latest posts in /journalism/investigation

October 1, 2015

Shielded by the law

This weekend, The Seattle Times released our investigation into Washington's "evil intent" law, which makes it almost impossible to prosecute police officers for the use of deadly force: Shielded by law. This was a great project to work on, and definitely an issue I'm proud we could bring to a wider audience. The source code for it is available here.

This project is on a page outside of our CMS entirely to support the outstanding trailer video that our photo department put together, which plays at the end of the animation if your device supports inline video (sorry, iOS). We didn't want to jump directly into the video, because A) it has sound, and B) many readers might find its visuals disturbing. Using our common dot motif to pull people in first, and then give them a choice of watching the video, seemed like a nice strategy. The animation is all done in the DOM and is mostly done just by adding classes on a timer — the only real time that JS touches individual elements is to randomize the fade-in for the dots. Loading the page without JavaScript shows the final image of the animation, via a handy no-js class that the scripts remove before starting playback.

One of our interesting experiments in this story was the use of embedded quiz questions, asking people to test their preconceived notions of police shootings. Originally we intended to scatter these throughout the story to grab readers' attention, but a section on the numerical results of the investigation ended up spoiling the answers. Instead, we moved them to a solid block before that section, and it's been well-received. The interactive graphics were actually also a relatively last-minute addition: originally, we were just going to re-run the print graphics, but exposing all the data in a responsive way was just too useful to pass up.

Probably the most technically advanced part of the page is the audio transcript from the 1985 state senate hearing on the law. As the audio plays, the transcript auto-advances and highlights the current line. It also displays a photo of the speaker from the hearing, to help readers get an idea of the players involved. Clicking on the transcript scrubs the audio to the correct spot. We don't do a lot of audio work here, unfortunately, but I think having an interface that's friendly to readers and listeners alike is a really nice touch, and something I do want to take advantage of on future projects. We built it to generate the data from standard subtitle files, so it should be easy to revisit.

Lastly, one of the most important parts of the story is the least flashy: the table in the "by the numbers" section for deadly force rates by race/ethnicity. We had worked for a while with this information presented the same as the other trivia questions, via clickable dots, but found that the part we really wanted to stress (the relative rates of death proportional to the general population) didn't stand out as much as we wanted. We brainstormed through a few different alternate visualizations, including stacked bars and nested pie charts, but in the end it was just clearest to build a table.

Like Rodney Dangerfield, they may get no respect, but a well-designed table can often be the simplest, easiest way to get a point across. The question then is, what's a well-designed table? Personally, I think there's a whole post in that question — how you order the columns, effective sorting/filtering, and how to add extra features (embedded sparklines, detail expansion, and tree views) that add information without confusing readers. One day, maybe I'll write it. But in the meantime, if you're working on a similar project and can't quite figure out how to present your information, there's no shame in using a table if it serves the story.

12:57 x permalink

October 21, 2014

Loaded with lead

I'm very proud to say that "Loaded with lead," a Seattle Times investigation into the ways that gun ranges poison their customers and workers, went live this weekend. I worked on all four interactives for this project, as well as doing the header design and various special effects. We'll have a post up soon on the developer blog about those headers, but what I'd like to talk about today is one particular graphic — specifically, the string-of-pearls chart from part 2.

The data underlying the pearl chart is a set of almost 300 blood tests. These are not all tests taken by range workers in Washington, just the ones that had to be reported after exceeding the safe threshold of 10 micrograms per deciliter. Although we know who some of the tested workers are, most of them are identified only by an anonymous patient ID and the name of their employer. My first impulse was to simply toss the data into a scatter chart, but as is often the case, that first impulse proved ill-advised:

Although the tests are taken from a ten-year period, for any given employer/employee the tests tend to be more compact in timeframe, which makes it tough to look at any series without either losing the wider context, or making it impossible to see the individual tests.
There aren't that many workers, or even many ranges, with lots of test results. It's hard to draw a trend when the filtered dataset might be composed of only a handful of points. And since within a range the tests might be from one worker or from many, they can't really be meaningfully compared.
Since these tests are only of workers that exceeded the safe limit, even those trends that can be graphed do not tell a good story visually: they usually show a high exposure, followed by gradually lowered lead levels. The impression given is that gun ranges are becoming safer, but the truth is that workers with hazardous blood lead levels undergo treatment and may be removed from the high-lead environment, resulting in lowered test results but not necessarily a lead-free workplace. It's one of those graphs that's "technically correct," but is actually misleading.

Talking with reporters, what emerged was that the time dimension was not really important to this dataset. What was important was to show that there was a repeated pattern of negligence: that these ranges posted high numbers repeatedly, over long periods of time (in several cases, more than five years). Once we discard a strict time axis, a lot more interesting options open up to us for data visualization.

One way to handle this would be with a traditional box and whiskers plot, which shows the median and variation within a statistical set. Unfortunately, box plots are also wonky and weird-looking for most readers, who are not statisticians and would not know a quartile if it offered them a grilled cheese sandwich. So one prototype simplified the box plot down to its simplest form — probably too simple: I rendered a bar that began and ended within the total range of test results for each range, with individual test results marked with a line inside that bar.

This version of the plot was visually interesting, but it had flaws. It made it easy to see the general level of blood tests found at each range, and compare gun ranges against each other, but it didn't show concentration. Since a single tick mark was shown within the bar no matter how many test results at a given level, there was litttle visual difference between two employers with the same range of test results, even if one employer mainly showed results at the top of the range, and the other results were clustered at the bottom. We needed a way to show not only level, but also distribution, of results.

Given that the chart was already basically a number line, with a bar drawn from the lowest to the highest test result, I removed the bar and replaced the tick marks with circles that were sized to match the number of test results at each amount. Essentially, this is a histogram, but I liked the way that the circles overlapped to create "blobs" around areas of common test results. You can immediately see where most of the tests fall for each employer, but you don't lose sight of the overall picture (which in some cases, like the contractors working outside of a ventilation hood at Wade's, can be horrific — almost three times the amount considered dangerous by the CDC). I'm not aware of anyone else who's done this kind of chart before, but it seems too simple for me to be the first to think of it.

I'd like to take a moment here to observe that pretty much all data visualization comes down to translating information into a form that our visual systems are evolved to quickly understand. There's a great post on how that translation functions here, with illustrations that show where each arrangement sits on a spectrum of perceived accuracy and meaning. It's not rocket science, but I think it's a helpful perspective: I'm just trying to trick your visual cortex into absorbing a table's worth of data at a glance.

But what I've been trying to stress in the newsroom from this example is less technical, and more about how much effective digital journalism comes from the simple process of iteration and self-evaluation. We shouldn't expect to come up with a brilliant interactive on the first try every time, or even any of the time. I think the string-of-pearls is a great example of that, going from a visualization that I was confusing and overly-broad to a more focused graphic statement, thanks to a lot of evolution and brainstorming. It was exhausting work, but it's become my favorite of the four visualizations for this project, and I'm looking forward to tweaking it for future stories.

10:02 x permalink

Past - Present