This weekend, The Seattle Times released our investigation into Washington's "evil intent" law, which makes it almost impossible to prosecute police officers for the use of deadly force: Shielded by law. This was a great project to work on, and definitely an issue I'm proud we could bring to a wider audience. The source code for it is available here.
One of our interesting experiments in this story was the use of embedded quiz questions, asking people to test their preconceived notions of police shootings. Originally we intended to scatter these throughout the story to grab readers' attention, but a section on the numerical results of the investigation ended up spoiling the answers. Instead, we moved them to a solid block before that section, and it's been well-received. The interactive graphics were actually also a relatively last-minute addition: originally, we were just going to re-run the print graphics, but exposing all the data in a responsive way was just too useful to pass up.
Probably the most technically advanced part of the page is the audio transcript from the 1985 state senate hearing on the law. As the audio plays, the transcript auto-advances and highlights the current line. It also displays a photo of the speaker from the hearing, to help readers get an idea of the players involved. Clicking on the transcript scrubs the audio to the correct spot. We don't do a lot of audio work here, unfortunately, but I think having an interface that's friendly to readers and listeners alike is a really nice touch, and something I do want to take advantage of on future projects. We built it to generate the data from standard subtitle files, so it should be easy to revisit.
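To make the subtitle-driven approach concrete, here is a minimal sketch in Python of how transcript data could be generated from a subtitle file and then used for highlighting and seeking. It is not the code from the project. It assumes the SubRip (.srt) format and a hypothetical "SPEAKER: text" convention inside each cue, and the names `parse_srt`, `cue_index_at`, and `SAMPLE` are invented for illustration.

```python
import re
from bisect import bisect_right

# SubRip timestamps look like 00:01:02,500 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp):
    """Convert one subtitle timestamp into seconds as a float."""
    h, m, s, ms = (int(part) for part in TIMESTAMP.match(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(text):
    """Turn subtitle text into a list of cue dicts, in file order."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip anything that is not a complete cue
        start, end = lines[1].split("-->")
        body = " ".join(lines[2:])
        # Assumption: the speaker is written as a "NAME: " prefix in the cue text.
        speaker, sep, spoken = body.partition(": ")
        if not sep:
            speaker, spoken = None, body
        cues.append({
            "start": to_seconds(start),
            "end": to_seconds(end),
            "speaker": speaker,
            "text": spoken,
        })
    return cues


def cue_index_at(cues, seconds):
    """Index of the cue to highlight at a playback time, or None before the first cue."""
    starts = [cue["start"] for cue in cues]
    index = bisect_right(starts, seconds) - 1
    return index if index >= 0 else None


# Placeholder cues, not text from the actual hearing.
SAMPLE = """1
00:00:00,000 --> 00:00:04,500
SPEAKER A: First placeholder line.

2
00:00:04,500 --> 00:00:09,250
SPEAKER B: Second placeholder line.
"""

cues = parse_srt(SAMPLE)
current = cue_index_at(cues, 5.0)    # which line to highlight while audio plays
seek_to = cues[current]["start"]     # where a click on that line would scrub to
```

The same cue list serves both directions: playback time maps to a line for highlighting and for choosing the speaker photo, and a clicked line maps back to its start time for scrubbing.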
Lastly, one of the most important parts of the story is the least flashy: the table in the "by the numbers" section for deadly force rates by race/ethnicity. We had worked for a while with this information presented the same as the other trivia questions, via clickable dots, but found that the part we really wanted to stress (the relative rates of death proportional to the general population) didn't stand out as much as we wanted. We brainstormed through a few different alternate visualizations, including stacked bars and nested pie charts, but in the end it was just clearest to build a table.
Like Rodney Dangerfield, tables may get no respect, but a well-designed table can often be the simplest, easiest way to get a point across. The question then is, what's a well-designed table? Personally, I think there's a whole post in that question: how you order the columns, effective sorting/filtering, and how to add extra features (embedded sparklines, detail expansion, and tree views) that add information without confusing readers. One day, maybe I'll write it. But in the meantime, if you're working on a similar project and can't quite figure out how to present your information, there's no shame in using a table if it serves the story.
I'm very proud to say that "Loaded with lead," a Seattle Times investigation into the ways that gun ranges poison their customers and workers, went live this weekend. I worked on all four interactives for this project, as well as doing the header design and various special effects. We'll have a post up soon on the developer blog about those headers, but what I'd like to talk about today is one particular graphic — specifically, the string-of-pearls chart from part 2.
The data underlying the pearl chart is a set of almost 300 blood tests. These are not all of the tests taken by range workers in Washington, just the ones that had to be reported after exceeding the safe threshold of 10 micrograms per deciliter. Although we know who some of the tested workers are, most of them are identified only by an anonymous patient ID and the name of their employer. My first impulse was to simply toss the data into a scatter chart, but as is often the case, that first impulse proved ill-advised.
Talking with reporters, what emerged was that the time dimension was not really important to this dataset. What was important was to show that there was a repeated pattern of negligence: that these ranges posted high numbers repeatedly, over long periods of time (in several cases, more than five years). Once we discard a strict time axis, a lot more interesting options open up to us for data visualization.
One way to handle this would be with a traditional box-and-whiskers plot, which shows the median and variation within a statistical set. Unfortunately, box plots are also wonky and weird-looking for most readers, who are not statisticians and would not know a quartile if it offered them a grilled cheese sandwich. So one prototype stripped the box plot down to its bare essentials, probably too far: for each gun range, I rendered a bar that ran from the lowest test result to the highest, with individual test results marked with a line inside that bar.
This version of the plot was visually interesting, but it had flaws. It made it easy to see the general level of blood tests found at each range, and compare gun ranges against each other, but it didn't show concentration. Since a single tick mark was shown within the bar no matter how many test results fell at a given level, there was little visual difference between two employers with the same range of test results, even if one employer's results sat mainly at the top of the range and the other's were clustered at the bottom. We needed a way to show not only level, but also distribution, of results.
Given that the chart was already basically a number line, with a bar drawn from the lowest to the highest test result, I removed the bar and replaced the tick marks with circles that were sized to match the number of test results at each amount. Essentially, this is a histogram, but I liked the way that the circles overlapped to create "blobs" around areas of common test results. You can immediately see where most of the tests fall for each employer, but you don't lose sight of the overall picture (which in some cases, like the contractors working outside of a ventilation hood at Wade's, can be horrific — almost three times the amount considered dangerous by the CDC). I'm not aware of anyone else who's done this kind of chart before, but it seems too simple for me to be the first to think of it.
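As a rough illustration of the difference between the two designs, the Python sketch below computes what each version would draw for one employer. It is not the project's code. The test values are made up (all above the 10 microgram reporting threshold), the names `ticks` and `pearls` are hypothetical, and scaling each circle's radius by the square root of the count, so that area tracks the number of tests, is my assumption rather than something stated above.

```python
from collections import Counter
from math import sqrt


def ticks(results):
    """Bar-and-tick version: one mark per distinct level, however many tests hit it."""
    return sorted(set(results))


def pearls(results, base_radius=3.0):
    """Pearl version: one circle per distinct level, sized by the number of tests.

    Returns (level, count, radius) tuples sorted by level. The radius grows with
    the square root of the count so that circle area is proportional to it
    (an assumed scaling choice).
    """
    counts = Counter(results)
    return [(level, n, base_radius * sqrt(n)) for level, n in sorted(counts.items())]


# Hypothetical results for two employers with the same lowest and highest values.
employer_a = [12, 12, 12, 12, 30]   # clustered at the bottom of the range
employer_b = [12, 30, 30, 30, 30]   # clustered at the top of the range

same_ticks = ticks(employer_a) == ticks(employer_b)   # the tick version cannot tell them apart
pearls_a = pearls(employer_a)
pearls_b = pearls(employer_b)
```

With these inputs the tick marks are identical for both employers, while the pearl data puts the large circle at 12 for the first employer and at 30 for the second, which is the distribution that the tick marks hid.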
I'd like to take a moment here to observe that pretty much all data visualization comes down to translating information into a form that our visual systems have evolved to understand quickly. There's a great post on how that translation functions here, with illustrations that show where each arrangement sits on a spectrum of perceived accuracy and meaning. It's not rocket science, but I think it's a helpful perspective: I'm just trying to trick your visual cortex into absorbing a table's worth of data at a glance.
But what I've been trying to stress in the newsroom from this example is less technical, and more about how much effective digital journalism comes from the simple process of iteration and self-evaluation. We shouldn't expect to come up with a brilliant interactive on the first try every time, or even any of the time. I think the string-of-pearls is a great example of that, going from a visualization that was confusing and overly broad to a more focused graphic statement, thanks to a lot of evolution and brainstorming. It was exhausting work, but it's become my favorite of the four visualizations for this project, and I'm looking forward to tweaking it for future stories.
Yesterday I attended the Knight-Batten awards for innovation in journalism with some of the other multimedia team members at CQ. There was some really interesting work being shown (such as ProPublica's Change Tracker project), as well as some about which I remain skeptical (the concept of "printcasting," for example, seems deeply misguided to me).
One award-winner that did truly impress me was the Center for Public Integrity's investigative journalism into tobacco smuggling. In the project, titled Tobacco Underground, CPI lays out the global implications of the illicit tobacco economy, including hazardous counterfeit cigarettes from China, contraband flooding out of Russia and Ukraine, and a billion-dollar black market in the US and Canada run by organized crime. Tobacco is even a major funding source for terrorists in Pakistan, Northern Ireland, and Colombia. CPI's piece is an astonishing look at something that I (as a non-smoker) and likely most others would never suspect was an international criminal enterprise worth billions of dollars. Check it out.
Politifact, by way of CQ Politics, finds that the National Journal's ranking of Obama as "most liberal" might be, just maybe, a little suspect due to methodological error.
This is not news, frankly. The idea that Obama is definitely and objectively the "most liberal"--in a Senate that includes self-described socialist Bernie Sanders--is ridiculous. And after the magazine described John Kerry in 2004 as "most liberal," call me paranoid, but I suspect there's an editorial trend or narrative in play here.
But it is also amusing to me that this comes by way of Politifact, which is the CQ/St. Petersburg Times "truth squad" or factchecking team. In an election year, these things pop up like roaches in a dirty-bomb strike zone. They are big fun for journalists and editors--examine speeches and commercials for semantic slips and distortions, then trot out a few paragraphs of dry prose explaining exactly how and why that statement is or is not "spin." And perhaps, in this political era of Nixonian parsing, we need that.
But I hate truthsquadding, as it's called around the newsroom. It is the worst kind of gotcha journalism, and I think the industry can do better.
The basic idea of these fact-check columns, as far as I'm concerned, is flawed. It's flawed because it's redundant: our job, as journalists, should be to tell the truth and explain the obscure--to comfort the afflicted and afflict the comfortable, as they say, and with emphasis on the latter. If reporters are doing their jobs, there should be no need for a "special" department devoted to catching inaccuracies, because it should already be happening in the regular coverage. The fact that such departments exist is a tacit admission that accuracy isn't a concern elsewhere. And since I happen to know that CQ and St. Pete both have hard-working and dedicated fact-checking and research teams that go over our coverage with a fine-toothed comb, I sometimes wonder why it is that we are acting like we don't.
But more importantly, truth-squadding is journalism that refuses to see the big picture. To some extent, on the left or the right, who cares if someone takes some liberties when bragging on themselves, or when denigrating their opponents? What would be more important would be to examine not the wording of their speeches, but the impacts and outcomes of their policies.
On the other hand, that would require a lot of work, and a lot of interviews with experts, and possibly passing the reporting work over to someone with relevant expertise instead of the house pundit. And if the op-ed pages are any indication, I'm not sure that the media as an industry is willing to take that step.
"Every organization has crazy people working for it. A legitimate, trustworthy organization will not put the crazy people out front, in a position to deal with the public."
This message brought to you by the Epoch Times and Falun Dafa "Party to End the Party" event this Sunday. I'm trying to figure out how to write it up now. Hopefully it can be sold to someone or provide an interesting read here, but needless to say I'm deeply ambivalent about the whole thing. A lot of it had that LaRouche/anti-semite vibe that sends chills up my spine.