this space intentionally left blank

January 14, 2010

Filed under: journalism»new_media

Your Scattered Congress 2.0

It's been a big week for CQ's vote studies, which measure the presidential support and party unity of each senator and representative on a series of key votes. Our editorial and research team finished up the results for President Obama's first year in office, leading to a pair of NPR stories based on that data, in addition to our own coverage, of course.

To accompany our stories, I built a new version of our vote study visualization, leveraging what I've learned since creating the original almost two years ago. It is, as you'd expect, faster and better-looking. But there are subtle improvements included as well, ones I hope will make this a solid base for our vote study multimedia during the Obama administration.

  • It's auto-bounded, with no hard-coded graph boundaries. If Rep. Walt Minnick, D-Idaho, or Sen. Susan Collins, R-Maine, decide to diverge from their party even more than they already do, I won't have to go in and mess with the graphing algorithms to keep them inside the plot.
  • The math is more precise, and now adapts to any viewport dimensions. If we want a version of this that fits in a smaller or larger space, it will handle that without distortion.
  • It's more informative. The tooltips and display mechanism have been beefed up so that more information is available at a glance, including the ability to see the name of an individual member on mouseover if there's only one for that specific datapoint, and the option to see all members in a column for the distribution views.
  • The "Find Member" function has been beefed up and made considerably easier to use.
  • It combines my lovingly-crafted graphics and the data table view into one movie, instead of splitting them into two individually-loaded Flash objects. This, as much as anything else, has probably lowered the initialization time substantially, but it also puts all the information right at your fingertips with no need to scroll the page.
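
For the curious, the auto-bounding and viewport scaling from the first two bullets boil down to something like the following. This is a rough Python sketch rather than the Actionscript that runs in the actual graphic; the function names, the 5% padding, and the sample scores are all assumptions made up for illustration.

```python
def auto_bounds(values, padding=0.05):
    """Return (low, high) axis limits derived from the data, plus a little padding."""
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid a zero-width range if every value matches
    return low - span * padding, high + span * padding


def to_viewport(value, bounds, size):
    """Map a data value linearly onto a pixel axis of the given size."""
    low, high = bounds
    return (value - low) / (high - low) * size


def layout(points, width, height):
    """Place (unity, support) pairs inside a width x height plot area."""
    x_bounds = auto_bounds([p[0] for p in points])
    y_bounds = auto_bounds([p[1] for p in points])
    return [
        (
            to_viewport(x, x_bounds, width),
            # Screen y grows downward, so flip the vertical axis.
            height - to_viewport(y, y_bounds, height),
        )
        for x, y in points
    ]


# Hypothetical (party unity, presidential support) scores, in percent.
scores = [(40.0, 65.0), (98.0, 97.0), (75.0, 80.0)]
pixels = layout(scores, width=600, height=400)
```

Because the limits come from the data and the mapping only depends on the width and height passed in, an outlier or a differently sized embed changes the inputs rather than the code.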

As I've said before, I'm extraordinarily proud of the work our vote study team does, and thrilled to be able to contribute to their online presence in this way. Check it out, and I'd love to hear your thoughts.

January 6, 2010

Filed under: journalism»new_media

Standard Eyes

As part of my new team leader position at CQ, I get to pick which technologies and platforms our multimedia team will use for its projects. This is less impressive than it sounds: for content management reasons, our team often has to work separately from the rest of the CQ.com publishing platform, so it's not like I get to decide the fate of the organization. In any case, today I want to talk about a particular aspect of the limited power I do have: the use of "web standards" in creating online journalism.

Almost nobody thinks of news organizations as online tech leaders, but we create a lot of content that regular people (i.e. not nerds) actually read and interact with. There's a strong push online for content creators (including media organizations) to employ standards--and by standards, what's usually meant is strict HTML (including the proposed new tags in HTML 5) instead of Flash. It's an approach with several advantages, including searchability (unlike Rupert Murdoch, I welcome search engines), mobile readiness (until Adobe gets their ARM plugin working), better text and mash-up capabilities, and better UI consistency. We generally start project planning with HTML/JavaScript as a possible solution.

But it's wrong to think that we should avoid Flash for ideological reasons instead of jumping in the moment it becomes more convenient--and frankly, the "web standards" approach is often anything but convenient, particularly for interaction and rich graphics. Building good-looking UI components out of div tags or fighting with stylesheets is not my idea of a good time. And it's not just painful, it's much less productive compared to the rapid pace of development in Actionscript. I personally feel that the speed factor--the time it takes for me to write a complex, rich application--is something that web standards groups aren't spending enough time on. The <aside> tag won't help me create content faster, while making CSS behave in a sane and easily predictable fashion would, but there are working groups for the former and seemingly none for the latter.

(Advocates for these "semantic" tags, by the way, would do well to read Arika Okrent's In The Land of Invented Languages, particularly the parts about the "philosophical" conlangs, which attempted--and failed miserably--to create a logical, self-evident classification for all the concepts we express in our messy and meaning-overloaded "natural" languages. Sound familiar?)

HTML 5 proponents point to its new tags (such as <canvas> or <video>) as alternatives, an idea that should make even the most inexperienced Actionscript developer chuckle in cynical mirth. Canvas in particular is phenomenally unsuited to replace Flash's animation and interaction capabilities, as a single glance at the API tutorial should make clear. All drawing is done manually on every frame, transforms are awkward, and compositing is done in the most confusing possible manner. It's fine for simple graphs and charts, but I'd have to re-implement the equivalent of Actionscript's display list--its powerful, tree-based rendering engine--and its event dispatch libraries from scratch before canvas could be useful. Our team's time is too valuable to spend hacking around on that kind of low-level functionality instead of producing actual journalism. Not to mention the time it would take to replace Actionscript's enormous library of other utility code in the DOM (also known as the world's worst programming API).

Besides, the realpolitik of the web is that most of our readers are probably still on IE, and it has no current or planned support for canvas, much less audio and video tags. We're producing work for a mass audience--we can't afford to be purists, especially since more people have Flash Player installed than have a browser capable of high-performance JavaScript anyway. Flash is more consistent across browsers than supposedly "standard" code, as well. Ultimately, it's managed to do what Java never really managed, and what the browser has accomplished only with great difficulty: create a cross-platform application platform that people will actually use.

All of which is to say that I just can't get worked up when people start ranting about killing off Flash and replacing it with "standards"-based design. As far as I'm concerned, Actionscript has become a de facto standard for the web, one that anyone can leverage (the free Flex SDK and FlashDevelop IDE are a must-have combo). By all means, let's put pressure out there for less centralized and more open solutions, ones that aren't owned by a single corporate entity. But in the meantime, if we want to get things done, there are two options. We can shun Flash out of spite, in favor of solutions that require more work for less return. Or we can start telling news stories in interesting ways using this technology. I know which path my team is going to take.

October 27, 2009

Filed under: journalism»new_media

Measure of Truthiness

Being the hip young technologist that she is, Belle has one of those Palm Pre phones, which does something very cool: given login information for various social media accounts (Google, Facebook, etc), it collates and cross-links that information into the device's contact list. So a person's ID picture is the same as their Facebook profile image, and when they update their contact information online, it automatically changes on the phone. Handy--when it works.

My understanding is that most of the time it does, but sometimes Palm's system doesn't quite connect the dots, and then Belle has to go in and tell it that certain entries are, in fact, the same person. Frankly, I'm impressed that it works at all. It's an example of the kind of pattern recognition that people are very good at, and computers typically are not. I personally think we'll always have an edge, which makes me feel absurdly better, as if Skynet's assassin robots will never be able to track down Sarah Connor or something.

In essence, what Palm has done is create a system for linking facts with a confidence threshold. And it's something I've been thinking about in relation to journalism, particularly after watching a presentation by the Sunlight Foundation on their data harvesting efforts during the age of data.gov, not to mention the work I've been doing lately on budget and economic indicators. There's a lot of information floating around (and more every day), but how can we coordinate it with confidence? And is it possible that the truth will get buried under its weight?

Larry Lessig, of all people, pessimistically pitched the latter earlier this month, in a New Republic essay titled "Against Transparency." Lessig ties together the open government movement, free content activists, and privacy advocates into what he calls the "salience" problem: extracting meaning in context from a soup of easily-manipulated facts, without swamping the audience in data or misinterpreting it for political gain. It's a familiar problem: I consider myself a journalist, but I spend pretty much my entire workday nowadays chin-deep in databases, figuring out how to present them to both our readers and our own editorial team for use. It is, in other words, the same confidence problem: how do we decide which bits of data are connected, and which are not?

Well, part of the answer is that you need journalists who are good subject experts. All the data in the world is meaningless unless you have someone who can interpret it. In fact, this is one of the main directions I see journalism exploring as newsrooms become more comfortable with technology. Assuming journalists can survive until that point, of course: being a deep subject expert is well and good, but it seems to be the first thing that gets cut these days when newsroom profitability drops.

Second, as journalism and crowdsourcing become more comfortable with each other, I think we're going to have to start tagging information with a confidence rating: how sure are we that these bits of information are related? Data that's increasingly pulled from disparate--and unevenly vetted--sources will need to be identified by its reliability. I'd still like to be able to use it, but I should be able to adjust for "truthiness" and alert others about it.
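
To make the idea a little more concrete, here's a rough Python sketch of linking records with a confidence rating attached. Everything in it is an assumption for the sake of illustration: the scoring rule, the 0.8 threshold, and the sample records are invented, and it says nothing about how Palm's system (or any newsroom tool) actually works.

```python
from difflib import SequenceMatcher


def link_confidence(a, b):
    """Score from 0.0 to 1.0 for how likely two records describe the same person."""
    name_score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    # A shared e-mail address is strong evidence; a missing one tells us nothing.
    if a.get("email") and a.get("email") == b.get("email"):
        return max(name_score, 0.95)
    return name_score


def link_records(records, candidates, threshold=0.8):
    """Pair each record with its best candidate and tag the link with its confidence."""
    links = []
    for record in records:
        best = max(candidates, key=lambda c: link_confidence(record, c))
        score = link_confidence(record, best)
        links.append({
            "record": record["name"],
            "match": best["name"],
            "confidence": round(score, 2),
            "trusted": score >= threshold,  # below the threshold, ask a human
        })
    return links


# Hypothetical contact lists from two different sources.
phone = [
    {"name": "Jon Smith", "email": "jsmith@example.com"},
    {"name": "Sarah Connor"},
    {"name": "Kyle Reese"},
]
online = [
    {"name": "Jonathan Smith", "email": "jsmith@example.com"},
    {"name": "Sara Conner"},
]
links = link_records(phone, online)
```

The point is that the confidence number travels with the link instead of being thrown away, so anyone using the data downstream can see which connections were solid and which were guesses.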

But perhaps most importantly, this kind of debate really highlights how the open government movement needs to be not just about the amount of data, but also its degree of interoperability. This has really been driven home to me on the federal budget process: from what I can understand of this fantastically complicated accounting system, you can track funds from the top down (via the subcommittees), or from the bottom up (actual agency funding). But getting the numbers to meet in the middle is incredibly hard, due to the ways that money is tracked. Indeed, you can get the entire federal budget as a spreadsheet (it's something like 30,000 line items), but good luck decoding it into something understandable, much less following funding from year to year.

That's a problem for a journalist, but it's also a problem as a citizen. Without clean data, open government initiatives may be severely weakened. But contra Lessig, I don't think that makes them worthless. I think it creates an interesting problem to solve--one we can't just brute-force with computing power. Open government shouldn't just be about amount, but about quality. When both are high, I see a lot of great opportunities for future reporting.

October 2, 2009

Filed under: journalism»new_media

Undertow

My apologies for a slow week posting here--in addition to rewriting the site and learning a bit more about Android, you may have heard that there's been some excitement going around at CQ. It's been busy.

But we're not the only journalistic institution feeling a little shaken up. In the aftermath of the Google Wave invite frenzy, Mark Milian of the LA Times got a little overexcited. He lists some "wild ideas" they've had while testing the technology. And I am all for wild ideas, but I think he's missing the point. The problem in newsrooms isn't the lack of technology, it's that journalists don't use it.

Case in point: most of Milian's suggestions involve using Wave as a kind of glorified content management system--using it to log notes during collaborative stories, archiving interview recordings, or providing a better update mechanism. I absolutely understand why such a thing seems like a dream come true, because as far as I can tell most CMSs in the journalism world are appalling (often because they were geared toward print needs, and have been jury-rigged into double-duty online). But look realistically at what he's asking for: effectively, it's a wiki (albeit a very slick one) and a modern editorial policy. This isn't rocket science.

We've had the tools to do what Milian wants for years now. The problem, in my experience, has been getting reporters and editors to cooperate. They're an independent lot, and we still sometimes have trouble getting them to use our existing centralized, collaborative, versioned publishing toolchain, much less a complex and possibly overwhelming web app like Wave. Moreover, what's the real benefit? Will we get more readers with prettier correction notes? Will the fact-checking be more accurate if it's transmitted over AJAX? Can Wave halt the erosion of trust in American journalism? No? Then it's kind of a distraction from the real problem, as far as I'm concerned. I mean, I'd love it if all the reporters I work with knew their way around a data visualization. I'd like a pony, too. But at the end of the day, what matters is the journalism, not the tools that were used to create it.

Where Milian might have a point is in the centralization of Wave, with its integration of audio, video, and image assets. The catch is where it's centralized: with Google. I doubt many newsrooms are incredibly keen to trust reporters' notes, call recordings, and editing chatter entirely to a third party, particularly one with which they already consider themselves at odds. There are real questions of liability, safety, and integrity to be considered here. Not to mention what happens if one of those interlinked services goes down (I'm looking at you, GMail). If we're headed for a griefer future (and I think we are), maybe it's wise not to leap headfirst into that cloud just yet.

So look: everything he's written is a fine idea. I agree that they'd be great options to have, and you'll never find me arguing against better content management. But the barrier to entry has never been that we lacked a Google Wave to make it happen--it's been an ideological resistance to the basic principles of new media publishing in newsrooms around the country. Until you change that, by convincing journalists of the value of community interaction/link traffic/transparency/multimedia, all the fancy innovations in the world won't make an impact.

August 11, 2009

Filed under: journalism»new_media

News Not Useless

Since web video is kind of a hobbyhorse for me, at least one coworker has sent me their reactions to the Washington Post's ill-advised "Mouthpiece Theater" videos. These were a series of "comedy" shorts centering on political reporters Dana Milbank and Chris Cillizza, culminating in a piece that recommended a brand of beer named "Mad Bitch" to Secretary of State Hillary Clinton. The Columbia Journalism Review has a decent overview here. CJR's Megan Garber also draws attention to an important point from the paper's ombudsman: the Post views this, and other web video, as an "experiment."

I wish I could say that this is uncharacteristic. But there's just something about new media that makes otherwise sane, respectable journalistic outlets ignore the infrastructure of fact-checking, editorial review, and reputational risk that they've built for their traditional output. Executive Editor Marcus Brauchli admits as much in his reply to the Center for New Words when he writes: "We did not have a good process in place for reviewing videos before they are published on our site, and we are correcting that." Obviously, the Post would never treat its print reporting with a similar lack of oversight, but when YouTube enters the picture, caution is apparently tossed to the winds. I don't know exactly what it is that causes this. But I do have some guesses.

  • Journalists think they're funny. They're not. I learned a lot from doing competitive speech in college, but one of the most important lessons I took from it was the realization that not everyone is equally funny. Indeed, competing against really good after-dinner speakers allows a person to rank precisely how not-funny they are, and I'm not high on the list. That's not to say that I don't use humor when I feel like it's useful--on the contrary, my highest-ranking speeches often incorporated jokes and wordplay--or that I can't be amusing company, but I learned very quickly the difference between telling a joke or two and actually being funny.

    In truth, reliable comedy takes a massive amount of work. The Onion staff says in interviews that they start each week with six to eight hundred headline ideas, which are eventually culled to the 15 or 20 strongest candidates before publication. With that much effort bent to the task, the Onion and the Daily Show simply make this look easy. Journalists attempting to ape them quickly find out that it's not. Mouthpiece Theater caused offense for a joke that went too far, but the first warning sign should have been that it wasn't particularly hilarious to begin with.

  • Lack of newsroom diversity is insulating. Strictly speaking, Milbank and Cillizza's video wasn't just sexist. There's also a degree of race- and class-based humor that's not merely unfunny, but is also uncomfortable to watch and entirely inappropriate for an outlet in the Post's position. It's my personal belief that these kinds of remarks are far more common when the environment and management suffer from a lack of diversity--say, a newsroom/editorial team that's mostly white, upper-class, and male. If there's a better argument for newsroom diversity than the environment that produced Mouthpiece Theater, I can't think of one.
  • Stars get a free pass. Alessandra Stanley's recent mistake-ridden obituary for Walter Cronkite was almost parodic in its scope: among other errors, she got the dates of the moon landing and the MLK assassination wrong. As James Rainey remarks in the LA Times, this is actually part of a larger trend at both the New York Times and other news organizations: the tendency to pamper "star" reporters when it comes time to fact-check. Unsurprisingly, however, those same stars (as in the Washington Post's case) are often the first chosen for new media ventures, in order to capitalize on their "brand." The result is that video starring those reporters--not to mention other multimedia--is not subjected to the same scrutiny it would get if it were made by a relative nobody.
  • Or would it? In addition to the unwillingness to criticize star employees, I suspect that many editors are afraid to bring a critical eye to bear on new media for fear of revealing their unfamiliarity with it. Nobody wants to look like an idiot. And within the journalism community, the stereotype of bloggers/web video creators as basement-dwelling nerds is still alive and well, so the perceived level of minimum quality is very low. As a result, the reputational warning alarm either doesn't go off, or is suppressed.

The answer to many of these problems, one which I think a lot of news organizations are struggling with (certainly something which is occupying my own time), is coming up with an editorial process for new media that's equivalent to the print process. In terms of video, for example, I argue for a four-stage process:

  1. An editorial meeting on the prospective topic before any footage is shot.
  2. A review meeting after the video capture stage, so the direction being taken is discussed and approved.
  3. A comprehensive check part-way through the editing process, to make sure that the footage and script don't have any problems, and to give feedback while changes can still be implemented relatively easily.
  4. A final approval stage before the video is released to the web. This is the last chance for top-level editorial staff to spike a video if it seems questionable.

This isn't unreasonable, I don't think. In fact, it's meant to be roughly analogous to the editorial process that takes place when a reporter wants to write a story for any of our publications. Would such a structure have caught the Post's embarrassing online gaffe? Maybe. But my sincere hope is that a real editorial process would play a more profound role: it should have stopped the entire excruciating series from being broadcast in the first place.

I'll close with a somewhat in-the-trenches observation: as print organizations have moved online, there's been a great deal of panic over the role that video and multimedia will play in relation to more familiar formats. Most of the time, this panic means there's no clear vision behind their use: are they for clowning around? For infodumps by talking heads? For reposting network footage to accompany articles? For aping the stilted, much-ridiculed delivery of the local TV news? You only have to look at the schizophrenic archives of most American media sites to realize that there's no real plan behind it (the unsurprising exception among the big names being the New York Times, which has a generally savvy new media team).

In elementary school, we learn to write about the five questions: who, what, when, where, and why. I think you can answer these in any medium--but I think that each format has its strengths. My guiding rule of thumb has been that video is best-suited toward answering the "who" and the "why"--the human angle, in other words. Who are these people? What are their motivations, and their reasoning? Video leverages the tools that we've evolved over millennia for reading faces and telling stories, in ways that would be very difficult to evoke objectively through text or an interactive graphic. In my opinion, as news organizations try to figure out where video fits into their lineup, that's the high-level discussion they should be having. In the meantime, they should probably leave the comedy to the experts.

August 3, 2009

Filed under: journalism»new_media

Generation Gap

Although it's the description I use professionally, I'm ambivalent about the term "new media." I worry that it implies a wider gap between print/broadcast and Internet-based journalism--when really, both are more similar than not. But then I see something like Ian Shapira's Washington Post op-ed, and I realize: sometimes, you have to spell these things out.

Shapira is very, very upset that a blog excerpted parts of his story, added commentary, and then linked to the original Post article. No, seriously: he spends 1,900 words complaining that The Internets Stole His Bucket.

My article was ripe fodder for the blogosphere's thrash-and-bash attitude: a profile of a Washington-based "business coach," Anne Loehr, who charges her early-Gen-X/Boomer clients anywhere from $500 to $2,500 to explain how the millennial generation (mostly people in their 20s and late teens) behaves in the workplace. Gawker's story featured several quotations from the coach and a client, and neatly distilled Loehr's biography -- information entirely plucked from my piece. I was flattered.

But when I told my editor, he wrote back: They stole your story. Where's your outrage, man?

They stole your story? That's a bit melodramatic, Anonymous Editor. They quoted chunks of it, summarized the rest with some snarky editorial commentary, and then linked both to the original article and its (badly-formatted) sidebar. In doing so, they drove a fair amount of traffic to the Post, something Shapira even admits:

Gawker was the second-biggest referrer of visitors to my story online. (No. 1 was the "Today's Papers" feature on Slate, which is owned by The Post.) Though some readers got their fill of Loehr and never clicked the link to my story, others found their way to my piece only by way of Gawker.

Even if I owe Nolan for a significant uptick in traffic, are those extra eyeballs helping The Post's bottom line?

A: Yes, since it's an ad-supported site. This has been another episode of short answers to stupid questions.

Shapira ends his piece with a weak plea for earlier credit and shorter excerpts, as if Gawker should just put up a link reading "Ian Shapira's Awesome Article at Washington Post" and leave it at that. But between the opening and closing paragraphs, he spends a significant amount of time blaming the Internet for killing journalism. He interviews a lawyer who's trying to get newspapers the ability to sue websites that excerpt their material, and who states "If you don't change the law to stop this, originators of news reports cannot survive." Yes, legislating success has worked out well for other industries, hasn't it?

There are a lot of reasons why the originators of news reports may be finding it hard to survive, but being quoted in a high-traffic blog like Gawker is not one of them. On the other hand, being the kind of news organization that spends nearly 2,000 words on this kind of whining probably isn't helping your case.

A little while back, one of my managers asked me to define "webby" as a sanity check after someone tried to use it in an excuse. It's not a word I'd personally use, I said, but I'd basically argue that it means doing three things: link to other people, make it easy for them to link to you, and take advantage of the format to adapt your voice. That's basically what "new media" means to me. You still do good journalism, but you realize that it's no longer published in a vacuum. I'm not sure why that's so hard for reporters and editors to understand. But by all means, guys, keep getting angry when people send traffic your way. Let's see how that works out for you.

July 8, 2009

Filed under: journalism»new_media

Your Scattered Congress, Continued

It's that time again: CQ has posted the newest version of its yearly vote studies, ranking legislators on party unity and presidential support. Again, this uses my Flash applets for presenting the tabular data, as well as scatter/distribution graphing.

As far as interesting emergent storylines go, there's not a lot for me to say yet. From the visualization end, I added medians to a couple of the plots but otherwise did relatively little tweaking. The one notable change was an adjustment to the House unity algorithm, due to the score of Rep. Walt Minnick, D-Idaho (and to a lesser extent, Rep. Bobby Bright, D-Alabama). Minnick has a unity score of 40%, the lowest of the House Democrats. As a result, I had to widen the "window" for that graph, which previously had no member with a unity score less than 50%. This had already been done in the Senate, thanks to Sen. Olympia Snowe, R-Maine.

You may notice some artifacting in the graph so far, particularly on the Democratic presidential support distribution. According to the editors for this data, it's probably due to the low number of votes tallied for 2009 so far, causing a "clumping" around a few support values. As we accumulate more data and update these numbers, a more natural distribution curve should emerge.

My remaining technical gripes with these graphs, which I haven't had time to correct, are the confusing method of listing members in distribution views and the odd scaling that's used to fit them all in. I suspect they can both be solved by reducing the pixel size in those modes far enough that a 1:1 ratio is reached--no overlapping of values within columns. And I think we're going to take it widescreen, to make that easier--realistically, the whole thing's due for a design overhaul anyway. But in the meantime, I think it continues to work reasonably well, and it's still one of my favorite projects here.
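
The 1:1 idea above is simple enough to sketch in Python. This is only an illustration of the arithmetic, not code from the Flash applet: the function name, the one-column-per-whole-percent assumption, and the sample scores are all hypothetical.

```python
from collections import Counter


def dot_size(scores, plot_width, plot_height, columns=101):
    """Largest square dot, in pixels, at which every member gets their own dot."""
    # Assumption: one column per whole-percent score, 0 through 100.
    counts = Counter(round(s) for s in scores)
    tallest = max(counts.values())
    by_height = plot_height // tallest  # the tallest column must fit vertically
    by_width = plot_width // columns    # every column must fit horizontally
    # If this clamps to 1, the plot is simply too small and dots would still overlap.
    return max(1, min(by_height, by_width))


# Hypothetical distribution: 40 members at 90%, 25 at 95%, one outlier at 40%.
scores = [90] * 40 + [95] * 25 + [40]
size = dot_size(scores, plot_width=606, plot_height=200)
```

With these made-up numbers the height is the limiting factor (200 pixels shared among 40 members), so going widescreen only helps when the column count is the constraint instead.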

May 27, 2009

Filed under: journalism»new_media

SCOTUS Nom nom nom

Are you, like all of DC, enraptured by the Sotomayor nomination? Feel free to keep track of the confirmation process (and compare it to past nominees, both successful and not) using CQ's interactive Supreme Court nomination graphic.

I'll tell you what killed me on this one: fonts. Our new special projects team member used to be the print graphics reporter, and as such she wanted to use the print fonts, like Benton Gothic. Don't get me wrong, Benton Gothic is a really nice font. But embedding fonts in Flash--particularly via code--is not a fun process. To be blunt, it's clumsy and unreliable. Of course, if your computer has the font, Flash will often pick it up. So interactives with embeds have to be tested on a (separate) clean platform, for which I use one of my VMs. This reveals another frustration: text rendering in Flash is incredibly inconsistent across platforms.

Now, better people than I (read: people who actually care) have commented on the difference between text on Mac, Windows, and Linux. In general, Microsoft respects the pixel grid, while Apple mimics print. Linux, as usual, offers a range of choices that approximate (but don't exactly match) the other two. I should add that while lots of people complain about font rendering on Linux, in my experience it's not that the type engine is bad so much as the fonts themselves are awful. Microsoft has spent a lot of money on great-looking screen fonts, and Apple just licenses classic print fonts, neither of which is easy for free software to match.

Regardless, for whatever reason, Flash seems to piggyback on the host platform's font rendering for its text. This may seem odd, given Adobe's prominence in type-layout software, but I'm guessing it's meant to be "cheap" in terms of runtime size and speed--both factors in Flash's success over Java as a multiplatform client. Now that they're dominant, though, I wish they'd spend a few kB on better font handling. When I look at my interactives on a different OS, the rendering changes don't just mean that it looks a little different, maybe blurrier or a bit more spidery, depending on your preference. Suddenly, fonts overflow their textfields, or dynamic layouts shift in undesired directions. If I wanted to fight with the text engine, I'd use HTML!

That's not even to discuss the outright bugs. In our scrolling map, there's a "tooltip" consisting of a floating TextField object. It follows the mouse and identifies specific districts, obviating the need to manage 435 labels at a variety of zoom depths. One day I got to spend an hour debugging why, for whatever reason, the tooltip was simply disappearing in Safari and IE. Turns out it was autosizing incorrectly--which kind of defeats the point of an "autosize" parameter.

Or how about this one: for no apparent reason, adding a TextField object to the display list of a sprite causes the bitmap filters to distort slightly. If you look closely at the Supreme Court graphic above, you'll notice that bars with text inside them (or next to them) are sometimes 1 pixel taller, seemingly because the GlowFilter being used to create an unscaled 1 pixel outline decided to be 1.5 pixels. The problem disappears if you zoom in. Why does this happen? Who knows?

Flash 10 includes some low-level improvements to text, which is a good start. But as far as I can tell, they primarily make working with fonts better within a single platform, and are aimed at people creating flexible layouts, like word-processing applications. People like me who work in graphic-intensive apps like data visualization and gaming are still probably out of luck.

April 21, 2009

Filed under: journalism»new_media

The Precision Hack

Yesterday, Jeff Atwood at Coding Horror linked to "Inside the Precision Hack", a blog entry describing the process by which 4chan hackers broke the Time 100 poll. The poll, which is meant to nominate the "world's most influential people," had practically no security built into the voting mechanism. The kids from notorious Internet sewer and discussion board 4chan were able to manipulate it to the point where they could spell out messages, acrostic-style, at the top of the list.

Since Coding Horror is a programming blog run by a guy who's relatively new to web programming, Atwood mainly sees this as a funny way to make a point: look how easy it is to bypass security when it's incompetent! But there's a wider question that ought to be raised: is this level of competency (or lack thereof) actually uncommon in journalism? And as newspapers and other outlets increasingly work through "new media," will they do so securely? What are the risks if they don't? These are relatively simple questions, and ones of self-evident importance. But as journalism conducts its internal debate regarding "innovation" in reporting, they're not questions that I'm seeing asked as often as they perhaps should be.

So what did Time do wrong? Turns out that they made lots of basic mistakes. The voting was submitted in plaintext using URL variables, and you could request the page using a GET instead of a POST, so innocent people could be enlisted simply by embedding an iframe on an unrelated page. When it became clear that this was skewing the vote, Time added a verification parameter consisting of the URL and a secret code run through an MD5 hash. Unfortunately, it sounds like they left the secret code in the Flash file as a literal, which is pretty easy to extract with one of the many SWF decompilers out there. These are some pretty weak security measures--a low barrier to entry that made it easy for some relatively-unskilled hackers to precisely manipulate Time's poll.
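To make the weakness concrete, here is a minimal Python sketch of the kind of scheme described above. It is an illustration only, not Time's actual code: the secret value, the URL format, and every function name are hypothetical. The first half shows why a secret shipped inside the client protects nothing once it has been extracted; the second half sketches one common alternative, where the secret never leaves the server and the token is tied to a session.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for the literal reportedly left in the Flash file.
EMBEDDED_SECRET = "not-the-real-salt"


def client_side_key(vote_url: str, secret: str = EMBEDDED_SECRET) -> str:
    """Roughly the scheme described above: MD5 over the URL plus a shared secret."""
    return hashlib.md5((vote_url + secret).encode("utf-8")).hexdigest()


def server_accepts(vote_url: str, key: str) -> bool:
    """The server recomputes the same hash and compares."""
    return hmac.compare_digest(key, client_side_key(vote_url))


# Anyone who decompiles the client knows the secret, so they can
# compute a valid key for any vote URL they care to invent.
forged_url = "/vote?id=1234&rating=100"  # hypothetical URL format
forged_key = client_side_key(forged_url, "not-the-real-salt")

# Alternative sketch: the secret stays on the server, and the client only
# ever sees a token derived from its own session identifier.
SERVER_SECRET = secrets.token_bytes(32)


def issue_token(session_id: str) -> str:
    """HMAC-SHA256 over the session id, keyed with a server-only secret."""
    return hmac.new(SERVER_SECRET, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_token(session_id: str, token: str) -> bool:
    """Constant-time check that the token was issued for this session."""
    return hmac.compare_digest(token, issue_token(session_id))
```

A server-side token like this does not stop ballot stuffing on its own (rate limiting and requiring POST would still be needed), but it does mean that decompiling the client no longer hands over the key.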

I want to make it clear that I'm not bringing this up at Time's expense, as Atwood is (I like Coding Horror, but he's not exactly a crack security researcher). In fact, I sympathize with Time. Security is hard! And expensive! And if you're not used to thinking about it from the very beginning, you're going to screw it up.

But why did it happen? Here's my completely unsubstantiated hunch: they got caught trying to do more with less. News organizations these days are caught between two directives: cut costs, and simultaneously jump onto the Web 2.0 bandwagon. These goals are directly opposed to each other. You can't get the kinds of programmers that you need to keep up with Google/Yahoo/Microsoft for cheap. So what happens? Chances are, you take journalists that are a little technically inclined, give them a few books on Ruby on Rails, and ta-da! you've got an "innovation"* team. It's not a recipe for tight security.

It doesn't help that the buzz in newsrooms for years has basically been around "hybrid journalists" who are video producers/writers/programmers all at once. Now, I have some respect for that idea. I personally believe in being well-rounded. But it's not always realistic, and more importantly, some things are too important to be left to generalists. Security is one of those things. Not only can poor data security undermine your institutional reputation, but it can be dangerous for your reporting, as well.

Take note, for example, of this article from Poynter on data visualizations. Washington Post reporter Sarah Cohen explains how graphing data isn't just useful for external audiences, but it can also help reporters zero in on interesting stories, or eliminate stories that actually aren't newsworthy. In fact, she says, the internal usage is probably far greater than the amount that makes it to the web or to print. It's a great explanation of why data visualization is an actual reporting tool that gets lost in the fuss over Twitter and blogging ethics panels.

So newsroom data isn't only meant for public consumption. It's a real source for journalists, particularly in number-heavy beats like public policy or business. And that means that data needs to be trusted. As long as it's siloed away inside the building, that's probably fine. Once it's moved outside and exposed through any kind of API, measures need to be taken to ensure it isn't tampered with in any way. And if it's used for any kind of crowdsourcing (which, to be fair, I have advocated in the past), that goes double.

So am I saying we should back away from opening up our newsrooms to online audiences? Not at all. But we should understand the gravity of the situation first, making sure that resources have been expended commensurate with reputational risk. And let's be honest: while it's great that NPR and the New York Times are making neat API calls and interactive polls available to everyone, maybe that's simply not appropriate--or aligned with the newsroom's primary mission--at smaller organizations.

Journalism has to come first. That journalism has to be trustworthy, down to the data on which it relies. Think of it as an editorial bar that needs to be cleared: if you don't feel like your security is up to the task, perhaps caution is in order. On the other hand, if you can't justify security from the start (as Time clearly couldn't), what you're really saying is that your results don't really matter (Time's certainly shouldn't). In that case, is it really the best use of your time?

March 24, 2009

Filed under: journalism»new_media

The Hard Parts

It's funny, when I first started at CQ, I thought that the most difficult part of the job would be doing quality work on a small staff. Unlike the New York Times or the Washington Post, we can't throw fourteen people at a project, so it often takes longer than I'd like--especially factoring in CQ's well-deserved reputation for fact-checking and accuracy. It's true, lack of resources has been a sticking point at times, but what has surprised me is that it's not the hard part.

For example, today we're launching our new district results maps. Previously, maps like these had been done in a very clumsy manner--literally, each state was a frame on the Flash timeline, with manually-placed zoom areas linked to another frame for districts that are very small, like downtown New York City and parts of urban California. This was not only a poor design, but it was practically unmaintainable. This time, with the help of our graphics team, we started from a complete, Flash-native vector map. Then I designed a UI framework that would not only allow Google Maps-style dragging, but also programmatically auto-zooms to states and to small or oddly-shaped districts. The result is easier to navigate, looks better, and will be far more adaptable for representing other datasets--it was a piece of cake to take the original House results map and change it to display presidential results. I'm very proud of how it turned out, and feedback has been stellar.
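The auto-zoom idea can be sketched in a few lines. The following Python is an assumption about how such a feature might work, not the map's actual ActionScript: given the bounding box of a state or district, it computes the scale and offset that center that box in the viewport. The `Rect` type, the `fit_to_viewport` name, and the `padding` parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Rect:
    """Axis-aligned bounding box in map coordinates."""
    x: float
    y: float
    width: float
    height: float


def fit_to_viewport(bounds: Rect, viewport_w: float, viewport_h: float,
                    padding: float = 0.1) -> Tuple[float, float, float]:
    """Return (scale, offset_x, offset_y) that centers `bounds` in the viewport.

    `padding` is the fraction of the viewport left empty on each side.
    """
    if bounds.width <= 0 or bounds.height <= 0:
        raise ValueError("bounds must have positive width and height")
    if not 0 <= padding < 0.5:
        raise ValueError("padding must be in [0, 0.5)")

    usable_w = viewport_w * (1 - 2 * padding)
    usable_h = viewport_h * (1 - 2 * padding)

    # Use the smaller ratio so the whole shape stays visible without distortion.
    scale = min(usable_w / bounds.width, usable_h / bounds.height)

    # Translate so the center of the shape lands on the center of the viewport.
    center_x = bounds.x + bounds.width / 2
    center_y = bounds.y + bounds.height / 2
    offset_x = viewport_w / 2 - center_x * scale
    offset_y = viewport_h / 2 - center_y * scale
    return scale, offset_x, offset_y
```

Because the scale comes from the shape's own bounds, a tiny urban district and a large rural state go through the same code path, which is what removes the need for manually placed zoom areas.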

Because we are a small shop, bottlenecks abound, and this map took two weeks from conception to finish. We still have work to do: searching and deep-linking are not yet enabled, and I want to refactor some of it out into a separate library. But still, this is the easy part: we were able to work completely within a known framework of Flash, Ruby, and internal XML. The hard part comes when we decide to place it on the actual website. CQ's publishing system, like those at many newspapers and magazines, is not geared toward interactive material. It doesn't understand it, and can't embed it--even embedding static images online is cumbersome, as for a long time CQ eschewed graphics in its print daily, and our web system is based on the print pipeline.

The result is a series of workarounds that we are still trying to streamline. Since we can't embed interactives, we end up storing them on other servers, which hurts our traffic numbers. Our search function also can't index items outside the standard text content management system, so we currently have to create pseudo-articles to link to them with metadata. And unless we are careful, such ad-hoc arrangements have a tendency to sprawl across directory structures and servers in such a way that they become impossible to manage in the future. I don't want to give the impression that we're standing still--CQ was a leader in early online journalism, and substantial upgrades to address these issues are in the making--but right now it's a real hassle.

If we do our jobs right, and I think for the most part we have, none of this is ever visible to our readers. But it's awkward and time-consuming from the newsroom's perspective, and we are certainly far from the only publication having these problems. To this day, I can't remember the last time I used a newspaper search box and got the results I wanted (Google finds them just fine, however). Perhaps this is why the industry has entered into such a state of hysteria over the Internet's corrosive influence, but it reflects a fundamental misunderstanding: these aren't indications that journalism fails to work online, but that print formatting fails. Does that seem obvious? Yeah, well: welcome to the conversation.
