
September 15, 2015

Filed under: journalism»industry

Value Ad

Welcome to the block party:

The math is even starker for smaller publications and individual bloggers, who rely more heavily on display advertising—and who have already been battered by shifts in the advertising market; some longtime professional bloggers, like Heather Armstrong, have given up writing their blogs full-time. The Awl's publisher Michael Macher told me that "the percentage of the network’s revenue that is blockable by adblocking technology hovers around seventy-five to eighty-five percent." Currently, readers use an ad blocker on around twenty-five percent of all pageviews. Nicole Cliffe, one of the founders of The Toast, said that "adblocker is brutal for us. And people always break out the 'Subscribers model! I donate twenty bucks a year!' thing but it doesn’t add up."

I'm finding myself thinking about adblocking a lot this week, and about publishing platforms. I spend a lot of time thinking about this in general, because I enjoy working for a Seattle newspaper and I would like it to still be here (in one form or another) fifteen years from now (at least), a prospect that was never guaranteed and looks noticeably more tenuous these days. And the upcoming launch of easy, widespread mobile ad-blocking software is a big part of that.

Bad apples

You can't say that the ad industry did nothing to deserve this, because of course it did. Online advertising has always been the place where incompetent programming and delusional management meet in a nexus of terrible. You're not a bad person if you work in ads, but you work for a bad business, and in all seriousness I will help you go work somewhere else if you get in touch with me. Contact info is on the right.

The problems that advertising causes for web pages are well documented. Ads slow pages down. They're heavy and disruptive. They cause security risks and drive-by hacks. There is a strong argument that a lot of the (admittedly welcome) improvements in web programming technique come from having to work around these issues: lazy-loading content, async scripts, module systems that can't be stomped by leaky ad code globals.
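
For flavor, here's roughly what two of those defensive patterns look like in practice. This is a generic sketch, not any particular site's code, and the ad URL is invented:

```javascript
// Load a third-party script asynchronously, so a slow ad server
// can't block the page render:
var script = document.createElement("script");
script.src = "https://ads.example.com/loader.js"; // invented URL
script.async = true;
document.head.appendChild(script);

// Wrap page code in an IIFE, so a leaky ad script's globals can't
// stomp on our variables (and ours can't leak out, either):
(function() {
  var render = function() { /* page code lives here */ };
  render();
})();
```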

As a side note, in these discussions, one of the big elephants in the room is that Google (and Facebook, and Apple, and Twitter) are all ad companies. Which is true, but it's true in the way that we might say that insects are a good source of protein — you're still not going to sell me a grasshopper sandwich. Lumping Google in with the average fly-by-night agency may be technically correct, but anyone who has interacted with regular ad code will tell you that the two are miles apart. If Google were actually the people writing the ads you see on an average media site, we probably wouldn't be having this discussion.

Well, we might. Apple might still have decided to stick their thumb in Google's eye out of pure spite, because they're a nasty little gang of capitalists, and that's kind of what they do. But it doesn't matter, because the really smart people at Google aren't writing actual ads. They write very elegant, high-performing auction software that distributes other people's horrible, horrible code, thus undermining quite a bit of their moral high ground. It's a little hard to get mad at readers who want to run content blockers or Greasemonkey scripts or whatever. Of course you want to block these ads! Who wouldn't?

Disruption and its discontents

We have a bad habit in the news industry: we have no faith in our ability to run a business, even though we speculate on it endlessly. As a result, Allison Hantschel has been writing posts like this for literally a decade now. One word for the embrace of such clear management-led self-sabotage is "trusting." Another word is "suckers."

Newsrooms are very good at grilling other organizations about their plans, and very bad at interrogating their own, in part because we're supposed to have a "wall" between the business and editorial sides of the enterprise. These days that wall is often porous, but the tradition is still there. So when the business half of a paper tells editors and reporters that running obnoxious ads is necessary, we don't often push back, even though we don't want to run them any more than readers want to see them.

This is an explanation, not an excuse. That said, it is inescapably true that the business models we chose, as an industry, are not proving to be as solid as they once were — and it is worth remembering that journalism really was (and in many cases, still is) wildly profitable. Craigslist killed off the classifieds, and content blockers will probably suck all the profit out of the banner-ad revenue stream. Ironically, the one strategy that's still surprisingly sound is printing the previous day's events into a complicated stack of folded paper and selling it for a buck or two. It's not a growth industry, but it seems to be relatively disruption-proof so far. Nobody seems very clear on how to take that model online, though, except by digitizing old people a la Kurzweil and counting on them to pay for content (probably a long shot).

The thing about Silicon Valley's lust for disruption is that, absent any principles other than a libertarian belief in market power, it tends to just recentralize or recreate the pre-disruption problems. So instead of having a corrupt taxi bureaucracy, now we have a corrupt Uber oligarchy, where half the cars you see in the app are fake and they're probably selling your ride history to data merchants in Russia for pennies on the dollar. You don't have to like the taxi system to think that this is kind of a bum deal. Similarly, you don't have to be a fan of advertising, or of advertising-supported journalism, to think that the inevitable outcomes of blocking display ads will range from bad to worse.

Personally, I think it's healthy to feel wildly uneasy with this entire dynamic, in which tech companies decide to target one bad actor and inflict collateral damage on an entire industry with a nonchalant wave of their hand. I think it's normal to believe that publishers are getting what they deserve for decades of bad management, and still feel like wiping them out is overkill. It's reasonable to think we should have control over the experience as users, while also arguing that media companies need to pay the bills somehow. But then, I'm not exactly disinterested, myself.

Brought to you by everybody, and nobody

I have a post that's been incubating for about two months now, about riot grrrl and open source. I started thinking about it when I watched The Punk Singer, a shockingly good documentary about Bikini Kill frontwoman Kathleen Hanna. And the story of the whole movement that she founded (along with a number of other influential women) is fascinating, because it's based on an entire ethic of self-publication and self-determination. They didn't like the commercial media that they had, so they made new media of their own and taught people how to do the same. To me, that's how open source should feel: undermining centralized power and giving the means of production back to the people.

But there's another way of looking at that, which is to say that riot grrrl zines never changed much of anything and the old open web got lost in the shuffle. We can romanticize both of them as much as we want, but at the end of the day they weren't capable of surviving against moneyed interests, and no amount of self-mythologizing is going to change that. That doesn't mean we should give up, but we need to be realistic about the gap between "should be" and "is," because we're in the middle of it now: readers should pay for journalism; they actively don't want to do so.

Our grim meathook media future

Here's one difficult truth: if you are a reporter, editor, or other news human in the year of our lord 2015, your fate is almost certainly on the web. The New York Times and the hot youth flavor of the day (Vox, Vice, BuzzFeed) may get invited into Instant Articles or Apple News, but everyone else is on their own. App-only publications have been tried, and failed, even with the force of Rupert Murdoch behind them. That leaves the web as the place where a diverse, free press can exist, especially once those print revenues finally dry up.

Here's another: the web is always going to grapple with hostile ads, because it's a platform built on remixing and embedding third-party content. The same things that let advertisers abuse your mobile connection also allow us to host comments via Disqus, or embed media from Twitter or YouTube, or create neat interactive features. Open platforms are messier, which is part of why they grow so effectively, and also why they have a hard time competing with closed, curated platforms. Nobody's going to make it easy for us.

Between those two difficult truths is a spectrum of uncomfortable options, ranging from paywalls to subscriptions to (most likely) bankruptcy. As Casey Johnston says in the Awl piece that opens this post, the likely outcome is the rapid eradication of many sites that currently scrape by on DoubleClick revenue. The small and the quirky are going to take the hit here, even if they're not so small: The Dissolve was shuttered earlier this year, despite a pretty impressive stable of contributors and support, and it won't be the last.

In the very long term, we all die alone. I hesitate to make any other predictions. But I suspect that the eventual fallout of these changes is the hollowing-out of the American media: two big national papers at the top; a horde of niche publications clinging, white-knuckled, to subsistence at the bottom; and not very much in the middle except the non-profits who have opted out of the entire rat race. That this arrangement parallels our national economic inequality is probably not a coincidence, but we're long past the point where anyone wants to hear a systemic critique. Will your favorite publication survive? It's time to spin the wheel and find out.

July 22, 2015

Filed under: journalism»industry

Covering letters

It's a low bar to clear, but I think I can honestly say that journalism has a better diversity record than tech. If there were a newsroom the size of Facebook, chances are high it would have hired more than 7 black people last year. But that doesn't mean we can't do better. And if we're going to talk about hiring in journalism, we need to talk about interns.

NPR's visuals team has decided to try making internships more diverse by being transparent about their requirements. Basically, they want to be clear about the expectations around cover letters and interviewing, so that people from non-privileged backgrounds know to prepare for them. I know and like several members of the team there, so I'm going to give them the benefit of the doubt when they say that there's more to come, but as a diversity program this seems a bit thin.

Firstly, a post on a little-trafficked blog is not exactly a high-visibility broadcast (said post isn't linked from any of the open internship positions as far as I could tell). It's easy for people to miss. More importantly, if the team is finding that cover letters and interviews are excluding good candidates, maybe the point should be to change the way that those are evaluated (or drop them entirely). Perhaps cover letters are not a great criterion for picking interns, or the way you're looking at them is biased in some way.

My own thoughts on this are complicated, not least because I see the playing field being artificially manipulated from all sides. I'm always amazed when I teach workshops at UW and hear that students may be on their fourth or fifth internship. They're behaving rationally — a lot of journalism careers are founded on student internships — but it's still bizarre to think that the path to a newsroom job might require literally years of unpaid or low-paying labor. If nothing else, there are a lot of people for whom that's just not an option.

Perhaps this is why, as CJR noted in a just-published report, minority journalists aren't finding jobs at rates proportional to graduation. In fact, minorities who graduated with a degree in journalism were 17% less likely than their white counterparts to find a print journalism job; in advertising, the gap was only 2%. As Alex Williams states:

Overall, only 49 percent of minority graduates that specialized in print or broadcasting found a full-time job, compared to 66 percent of white graduates. These staggering job placement figures help explain the low number of minority journalists. The number of minorities graduating from journalism programs and applying for jobs doesn’t seem to be the problem after all. The problem is that these candidates are not being hired.

I think the lessons from this are twofold. First, we should be better about spreading internships out to a wider range of students. That's partly about selecting more diverse candidates, but it's also about turning down applicants who have already had several internships in favor of candidates who need more of a boost. Internships are about experience, but they're also a way of pre-selecting who we want in the newsrooms of the future by burnishing their resumes. It's great to see NPR taking some responsibility, small or not, for their role in the pipeline. Hopefully other organizations will follow suit.

Additionally, maybe we should be less interested in internships as hiring criteria in the first place. Although my corner of the field is a little atypical, many of the best digital journalists I know didn't enter the field through a traditional career path (myself included). If our goal is to diversify our newsrooms, being open to a variety of backgrounds and experiences is part of how we get there. So a candidate didn't have an internship. So what? Can they write? Can they edit? Can they code?

I often worry about over-stressing credentials in journalism. Sure, it helps separate the wheat from the chaff, but it also brushes over the fact that what we do just isn't that hard. We go places, talk to people, and then write it down and give it to other people to read. You don't need a degree for that (as Michael Lewis aptly chronicled more than 20 years ago), and you shouldn't necessarily need an internship. As a community, we mourned the passing of David Carr, but we haven't learned the lessons he taught to writers like Ta-Nehisi Coates, about hiring "knuckleheads" and molding them into the industry we want to be. And until we do, we will still struggle to find newsrooms that reflect modern American diversity.

June 8, 2015

Filed under: journalism»professional

Paper Anniversary

It's ironic, I guess, that I was so busy at the Seattle Times a couple of weeks ago that I forgot to write about my one-year anniversary here. Anyway: it's been quite a year! I've done real estate visualizations, provided an overview of Oso Valley development, and covered the Washington state elections. I did much of the development on our major investigative pieces, Loaded with Lead and Sell Block (not to mention graphs and narrative interactives for the Warren Buffett mobile home investigation). I made a Seahawks fan map so good that the team outright stole it for themselves. For the local architecture buffs, I worked on a building quiz, and for the beer fans I helped build the landing page for our Brew with Us project. Want to know where the May Day protests went? I built a map for that. And this is just the big stuff.

In addition to the externally-facing development, I've been working on building tools that are used by the rest of the newsroom. I think our news app scaffolding is as good as anyone's in this business. We're leading the industry in custom element development, with responsive frames, Leaflet maps, and more. The watermarking tool I made on my second day is still in use, and will probably outlive me entirely at this point.

I have always had a low threshold for boredom, a character flaw that's led to overpacking for every trip I've ever taken and a general inability to read literary fiction. I love working in a newsroom for many reasons, but one of the greatest has always been that I am rarely bored here, and when it does happen, it never lasts more than two weeks. I cannot recommend this job highly enough for technical people who want to have an impact, or journalists who want to break out of a single beat. Working at the Seattle Times has been the most fun I've had at a job in a long time. I can't wait to see what the next year brings.

"As I look back over a misspent life, I find myself more and more convinced that I had more fun doing news reporting than in any other enterprise. It is really the life of kings."

--H.L. Mencken

May 20, 2015

Filed under: journalism»industry

Instant Noodles

Like all of Facebook's attempts to absorb the news industry, their new Instant Articles will probably follow a predictable timeline, and it basically looks like this:

  • 2015: Facebook introduces Instant Articles, in which a few media partners push their content directly into Facebook's servers, and (in the iPhone app only) it gets rendered without leaving the application. "Content," in this case, even includes the publisher's own ad and tracking systems.
  • 2016: The program expands to other publishers, albeit possibly with a few more "refinements" (read: restrictions) on what those publishers are allowed to do. It becomes fashionable in the newsroom to harass me about it.
  • Late 2016: Once Instant Articles gets some traction, Facebook finds a way to sabotage or undercut it. Either they'll introduce more restrictions on allowable features, or they'll lower the frequency at which the posts appear and charge newsrooms to "promote" them (or both — why take half measures?).
  • 2017: Noting that the magical promised ad dollars have not materialized (or are eaten up by tithes back to The Algorithm), media organizations start quietly reducing their Instant Article publishing rate. Jeff Jarvis writes a sad editorial about it.
  • 2018: Claiming it was an "educational" experiment, Facebook shuts down the program. Rumors begin circulating about its VR news platform, in which the New York Times will publish for Oculus Rift.

Instant Articles is not the first time Facebook has tried to take over the web, and it won't be the last. They're very bad at it, probably because they're the original kings of empty promises: working with Facebook is a constant stream of exasperation, until either you realize that they're incapable of maintaining a stable API/business relationship, or you slit your wrists. They've done it to game developers (goodbye, Farmville), to other newsrooms (remember Washington Post Social Reader?), and to anyone else who's tried to build on the various Facebook "platforms."

Lots of people have written very smart reactions to the Instant Articles announcement — I'm partial to Josh Marshall's behind-the-scenes take, John Herrman's spiral of bemused horror, and Zeynep Tufekci's reminder that Facebook cannot be trusted to engage honestly with its role as gatekeeper.

It's probably more fun to engage with the self-proclaimed "controversial" opinions, like this profoundly dumb thought-leadering from MG Siegler:

With Instant Articles, Facebook has not only done a 180 from what Mark Zuckerberg has called the company's biggest mistake, they've now done another lap just to prove a point.

They did a 180, and then took a lap, so... they ran the race backwards, which is a good thing? Somewhere, Tom Friedman feels a twinge of jealousy.

Not only is the web not fast enough for apps, it's not fast enough for text either. And you know what, they're right.

"They're right" that an app loading pre-cached text can be faster than a web browser downloading that same text from the network, yes. Apparently our plan now is just to restrict your reading material to what Facebook can download ahead of time. I hope you like Upworthy lists.

Though, in a way, Facebook itself really is just a web browser. It's just a different, newfangled one for a new era. A mobile era.

A different, newfangled web browser that only goes to Facebook, apparently. Who would want to read anything else? In the future, all websites are Facebook. (Ironically, according to the Instant Articles FAQ, they're fed from HTML anyway, so they're not even really that "new." But it's probably too much to expect Siegler to do research.)

Siegler's not the only person I've seen celebrating Facebook's move as an end to the open web (by which we mean HTML/CSS/JavaScript), although he's certainly one of the most gleeful (he also thinks Facebook should shut down its website entirely, in case you were wondering about the general quality of his business advice). Of course, you'll notice that these hot takes are not themselves published to Facebook, or to a native app somewhere. If that were the case, no one would have heard of them. They get posted to the web, where they can get linked and shared across social media, and read regardless of platform or hardware.

Even without bringing in ideology, the "native apps instead of the web" idea faces a tremendous number of problems once you think about it for more than thirty seconds. How do new publications like The Toast or FiveThirtyEight get traction when you have to manually download them from an app store to read them? If they get popular through the web first, why bother transitioning to native? Nobody makes "reader" apps for desktops and laptops, so what happens to them? Does anyone really want to write long-form on Facebook, a service that only recently added an "edit post" button? Who cares: punditry is hard, let's go shopping!

It's easy to pick on shallow people who think Instant Articles represent a grand utopian state, but I'd also like to celebrate people who are actually building in the opposite direction. This weekend, I went to a Knight-Mozilla code convening in Portland, which included a ticket to the Write the Docs conference. I'm not a documentation writer, really, so most of the conference went right past me. But the keynote on the second day was by Ward Cunningham, inventor of the wiki, and it was a fascinating look at what it would really look like to reinvent the web.

For the past few years, Cunningham's been working on "federated" wikis, which store content on multiple servers instead of using a single database. If you link to another person's wiki page and you want to change the content, you fork it a la GitHub, and edit the new local copy (which remembers its origin) right there in your browser. You can also drag-and-drop content into a new page, if you want to merge text from multiple sources. It's pretty neat. The talk isn't online, but he did another presentation at New Relic that covers similar material.
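
To make the fork model concrete, here's a toy sketch. This is not Cunningham's actual code; the page shape and the storage choice are invented for illustration:

```javascript
// A toy version of forking a federated wiki page: copy the remote
// content into local storage, and record where it came from.
function fork(remotePage, remoteOrigin) {
  var local = {
    title: remotePage.title,
    story: remotePage.story.slice(), // a local copy, free to edit
    origin: remoteOrigin             // the fork remembers its source
  };
  localStorage.setItem("wiki:" + local.title, JSON.stringify(local));
  return local;
}

// Drag-and-drop merging is similarly additive: pull items from one
// page's content into another's.
function merge(target, source) {
  target.story = target.story.concat(source.story);
  return target;
}
```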

Parts of Cunningham's pitch can sound kind of crankish, although I'm sure I would have said the same thing about the original wiki. But other parts are really interesting, such as the idea of creating a forkable attribution trail for data and reporting. Federated wikis are another attempt to decentralize and diversify the Internet, instead of walling it up behind a corporation's control. And a lot of it is inspired by the main insight that wikis had in the first place: on a wiki, you create a page by first creating a hyperlink to it, then following that link.

As a result, even though users don't directly type HTML into the window, this form of authorship is profoundly of the web, and it's the kind of thing that's never going to exist in a native application somewhere. The fact that Cunningham can experiment with adding new markup features in JavaScript — and even turn a browser into a new kind of hypertext reader, with a different interface paradigm — is what the web platform does best. Like water, it can flow, or it can crash.

And that's why it's ultimately ridiculous to act like some pre-cached news articles are the herald of a new media age. What the web gives us — a freedom for anyone to publish to everyone, a wildly cross-platform programming environment, a rich multimedia container where your plain-text article can live right next to my complex news app — is not going to be superseded by a bunch of native apps, and certainly not by Facebook. Instant Articles won't even be the future of news. Future of the web? Give me a break.

May 7, 2015

Filed under: journalism»new_media

Mayhem

I'm not sure what it says about Seattle that one of our biggest yearly events is a May Day protest that wreaks havoc across big chunks of downtown. What even competes? The Blue Angels shut down traffic on the bridges once a summer, and there's Seafair downtown, but reception to those is always pretty muted in my experience. International Workers' Day is the big show.

The May Day map I put together to track our reporters has quickly become one of my favorite projects for the Seattle Times. It was real-time, it posed interesting data challenges, and it really exploited our <leaflet-map> element more than anything else we've done so far. While I also wrote a post on it for our dev blog at work, I wanted to call out a couple of other interesting points here.

The most interesting technical detail here is the use of the Twitter streaming API, which delivers nearly instant updates for a search query (either on users, geolocation, or keyword). Node is a great fit for this, with the twitter module offering a readable stream that fires events as new items come in. Our scaffolding, on the other hand, is not intended to be run as a long-running process, and I didn't really want to retrofit Grunt into a general-purpose application framework. I ended up writing the Twitter part of the app as a completely separate, continuous Node process, which then dumped out its data as a JSON file and started a standard build/deploy in a child process whenever new data arrived.
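
The whole thing is less code than it sounds. Here's a sketch of that standalone process, assuming the 2015-era twitter npm module; the account IDs and the "grunt publish" build command are placeholders for the real thing:

```javascript
// stream.js: a long-running Node process that listens to Twitter and
// triggers a static build whenever new data arrives.
var fs = require("fs");
var spawn = require("child_process").spawn;
var Twitter = require("twitter");

var client = new Twitter({
  consumer_key: process.env.TWITTER_KEY,
  consumer_secret: process.env.TWITTER_SECRET,
  access_token_key: process.env.TWITTER_TOKEN,
  access_token_secret: process.env.TWITTER_TOKEN_SECRET
});

var tweets = [];

// follow our reporters' accounts (the IDs here are placeholders)
client.stream("statuses/filter", { follow: "123,456" }, function(stream) {
  stream.on("data", function(tweet) {
    tweets.push(tweet);
    // dump the data out as JSON...
    fs.writeFileSync("tweets.json", JSON.stringify(tweets));
    // ...then kick off a standard build/deploy in a child process
    spawn("grunt", ["publish"], { stdio: "inherit" });
  });
  stream.on("error", console.error);
});
```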

To store the tweets from the stream, the application uses a SQLite3 database, since that's the easiest way to query and update data. A static data store like this is not something that we've used on projects before, and I don't know if I'd re-use it again. Using SQLite itself is always a pleasure, but reliance on a local database means that I couldn't just clone the project from home and update it when I wanted to change the coloring on Saturday morning. Using cloud storage, like Google Sheets, has a lot of advantages for distributed and remote development.
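
For reference, the storage layer amounts to something like this; a sketch assuming the sqlite3 npm module, with an illustrative table layout:

```javascript
// db.js: persist incoming tweets so they survive restarts and can be
// queried and updated easily.
var sqlite3 = require("sqlite3");
var db = new sqlite3.Database("tweets.db");

db.serialize(function() {
  db.run("CREATE TABLE IF NOT EXISTS tweets " +
    "(id TEXT PRIMARY KEY, user TEXT, body TEXT, time INTEGER)");
});

// INSERT OR REPLACE means a re-delivered tweet just overwrites itself
function save(tweet) {
  db.run("INSERT OR REPLACE INTO tweets VALUES (?, ?, ?, ?)",
    [tweet.id_str, tweet.user.screen_name, tweet.text, Date.now()]);
}

module.exports = { save: save };
```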

Working with Twitter itself is an interesting problem, because it's clear that the company has no real coherent plan for outside developers. Over the last few years, the API for user access has been increasingly limited and broken as Twitter tried to drive third-party clients (which don't show ads and don't make money) out of existence. On the other hand, if you are building a Twitter bot, which our map effectively is, it remains a pretty useful and effective service for pub/sub communication. I'm not sure it says very much about Twitter's strategy that they'll let bots run wild while ordinary people are locked into a client monoculture, but that's honestly the least of my frustrations with them at this point.

All that said, I would personally use this stack again in a heartbeat. Twitter is not the highest social traffic source for the Seattle Times, but almost all of our reporters use it anyway, and it's much nicer to program against than Facebook. The impending dilemma is whether (or when) Twitter will decide to switch to a "curated" (read: algorithmically-tampered) stream a la Facebook's timeline. When that happens, its value to me as a news developer drops basically to nothing, because I won't be able to guarantee message delivery any more.

Which brings me to the most boring but probably most profound lesson of this project: we need a better build server. The May Day map ran on a box in the office we've affectionately dubbed "Cronda," which also currently tests our traffic alert application and previously powered the Seahawks fan map. In each of those cases, we've jury-rigged together a solution for pulling the latest source code and running builds at regular intervals (the cron Grunt task), but it's not optimal. We can't check on those builds remotely, or restart them if something goes wrong.
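
The jury-rigged solution in question looks roughly like this. It's a simplified sketch, with the task name and build command standing in for our real ones:

```javascript
// Gruntfile.js fragment: a "cron" task that pulls the latest source
// and re-runs the build at a regular interval, forever.
var exec = require("child_process").exec;

module.exports = function(grunt) {
  grunt.registerTask("cron", "Pull and rebuild on a timer", function() {
    this.async(); // never signals completion; runs until killed
    setInterval(function() {
      exec("git pull && grunt publish", function(err, stdout, stderr) {
        if (err) return grunt.log.error(stderr);
        grunt.log.writeln(stdout);
      });
    }, 10 * 60 * 1000); // every ten minutes
  });
};
```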

At some point, we'll probably move our builds from Cronda to an EC2 box that we can access remotely, but doing so doesn't honestly solve the problem — it just makes it less fragile. Eventually, I think we'll need to look into a real build monitor like Jenkins, which can automate deployments, track error logs, and respond to queries in our team chat. I'm not entirely looking forward to that, since it feels like a very heavyweight solution, but the more complex our applications get, the more a little up-front rigor will pay off.

April 23, 2015

Filed under: journalism»professional

Winner of 10

It has been a busy week, but I wanted to take a moment to recognize my colleagues at The Seattle Times for their tremendous work, resulting in a 2015 Pulitzer Prize for breaking news journalism. Their coverage of the Oso landslide was clear, comprehensive, and accurate, and followup work continues to this day (including one of my first projects for the paper). It's very cool to be working in a newsroom that's the winner of 10 Pulitzer Prizes over the years, and I'm looking forward to being here when we win #11.

April 16, 2015

Filed under: journalism»new_media»data_driven

Empty-handed

We have sent several people from the newsroom, including myself, to journalism conferences over the last few months. Most conferences are about 50% inspirational and 50% crap (tilted heavily crap-wards in the keynotes), but you meet good people and you get to see the nuts and bolts behind the scenes of some of the best interactive news stories published.

It's natural to come back from a conference with a kind of inferiority complex, and equally easy to conclude that we're not making similar rich presentations because we don't have the cool tools that those other (richer, more tech-savvy) newsrooms have. We too, according to this train of thought, need to be coding elaborate visualization generators and complicated new CMS features — or, as Ryan Pitts from Mozilla said to me last weekend at the Society for News Design workshop, "let's not rest until every paper in the country has built its own charting application."

I think better newsroom tech is important, but let's play devil's advocate for a bit with an unpopular hypothesis: developing tools for the editors and reporters at your newspaper is a waste of your time, and a distraction from the journalism you should be doing.

Why a waste of your time? Partly because newsroom tools get a lot less uptake than you probably think they do (certainly less than we'd hope they would). I've written a lot of internal applications in my time, and they've never been particularly popular, because most reporters and editors don't care. They're too busy doing journalism to use your solution (which is as it should be), and they are probably not big on technology anyway (I have a lot of reporters who can't use Excel, which pains me greatly). Creating tools for reporters is, most of the time, attacking the problem at the wrong point.

For many newsrooms, that wasted time will end up being twice as expensive, because development resources are scarce and UI is hard. Building a polished, feature-filled chart generator that the average journalist can use will take at least a couple of programmer months, which is time those developers aren't working on stories and visualizations that readers want. Are you willing to sacrifice that time, especially if you can't guarantee that it'll actually get used? That's a pretty big gamble, unless you have the resources of the New York Times. You're probably better off just going with an off-the-shelf package, or even finding a simpler solution.

I don't think it's a coincidence that, for all the noise people make about the new data journalism startups like Vox and FiveThirtyEight, 99% of their chart output does not come from a fancy tool or a complex interactive: they post JPEGs. And that's fine! No actual reader has ever complained about having to look at a picture of a graph instead of a souped-up vector rendering (in Vox's case, they're too busy complaining that the graph was stolen from someone else, but that's another story). JPEG is a perfectly decent solution when it comes to simply telling the story across the entire web platform — in fact, it's a great embodiment of "do the simplest thing that works," which has served me well as a guiding motto in life.

So, as a rule of thumb: don't build charting libraries. Don't build general-purpose databases. Don't build drag-and-drop slideshows. Leave these things to other people, who have time and energy to build them for a living. Does this mean you shouldn't create tools at all? No, but the target audience should be you, the news developer, and other semi-technical newsroom staff like the web producers. In other words, make technology for the people who will actually use it, and can handle something that's not polished to a mirror sheen.

I believe this is the big strength of web components, and one reason I'm so bullish on them at the Seattle Times. They're not glossy, end-user products, but they are a great balance between power and accessibility for people with a little technical skill, and they're very fast to build. If the day comes when we do choose to invest in a slicker newsroom app, we can leverage them anyway, the same way that the NYT's fancy chart tools are all built on the developer-oriented D3 library.
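
As a taste of how lightweight they are, here's a minimal element using the v0 registration API browsers shipped in 2015. The element itself is invented for illustration, not one of our actual components:

```javascript
// Define a trivial <inline-graphic> element: semi-technical users just
// write the tag in their markup, and the component handles the rest.
var proto = Object.create(HTMLElement.prototype);

proto.createdCallback = function() {
  var src = this.getAttribute("src") || "";
  this.innerHTML = "<img src='" + src + "' style='max-width: 100%'>";
};

document.registerElement("inline-graphic", { prototype: proto });

// usage in a story page:
// <inline-graphic src="chart.png"></inline-graphic>
```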

In the meanwhile, while I would consider an anti-tool stance a "strong opinion weakly held," I think there's a workable philosophy there. These days, I feel two concerns very strongly (outside of my normal news/editorial production, of course): how to get the newsroom to make use of our skills, and how to best use the limited developer resources we have. A "no tools" guideline is not an absolute rule, but it serves as a useful heuristic to weed out the kinds of projects that might otherwise take over our time.

March 27, 2015

Filed under: journalism»professional

Construction and architecture

In the last couple of weeks, a few more of my Seattle Times projects have gone live — namely, the animated graph in this story about EB-5 visa growth, and the Seattle architecture quiz. Both use the FLIP animation technique I wrote about a few weeks ago, although it's much more elaborate in the EB-5 graph, which animates roughly 150 elements at 60fps on older mobile devices.
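
For anyone who missed that earlier post: FLIP stands for First, Last, Invert, Play. You measure an element before and after a layout change, then animate a compositor-friendly transform between the two states, which is how the graph can move so many elements smoothly on older phones. A bare-bones sketch (the timing values here are arbitrary):

```javascript
// Measure First, apply the change, measure Last, Invert with a
// transform, then Play the transition back to the identity transform.
function flip(element, changeLayout) {
  var first = element.getBoundingClientRect(); // First
  changeLayout();                              // the real layout change
  var last = element.getBoundingClientRect();  // Last
  var dx = first.left - last.left;
  var dy = first.top - last.top;
  // Invert: snap the element back to its old position, untransitioned
  element.style.transition = "none";
  element.style.transform = "translate(" + dx + "px, " + dy + "px)";
  element.offsetWidth; // force a reflow so the inverted state applies
  // Play: let the browser animate the transform back to zero
  element.style.transition = "transform 300ms ease-out";
  element.style.transform = "";
}
```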

In the case of the architecture quiz, I also added the Babel compiler (formerly 6to5), which turns ES6 code into readable ES5 JavaScript that the average browser can run. Although it's not an enormous change, looking through the original source code will show the new object literal syntax, template strings, and (my personal favorite) fat arrow functions, which do not rebind this and offer a lighter-weight syntax for array sort, map, and filter operations.
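
An invented snippet (not from the quiz itself) showing those features side by side:

```javascript
var names = ["Rainier Tower", "Smith Tower", "Columbia Center"];
var width = 640;

var chart = {
  width, // object literal shorthand for width: width
  // fat arrows lighten up filter/sort/map, and template strings
  // replace string concatenation
  labels: names
    .filter(n => n.indexOf("Tower") > -1)
    .sort((a, b) => a.localeCompare(b))
    .map(n => `Landmark: ${n}`),
  listen(button) { // shorthand method syntax
    // an arrow doesn't rebind `this`, so this.width still refers
    // to the chart object inside the event handler
    button.addEventListener("click", () => console.log(this.width));
  }
};
```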

I'm not sold on all of the changes in ES6 — I think let is overrated, and the module syntax is pretty terrible — but these changes are definitely a positive step that reduces much of the boilerplate that was required for modern JavaScript. Most importantly, one of Babel's big advantages is that it produces readable output, compared to previous compilers like Traceur, so that even without source maps it's easy to debug. We've added Babel as a part of the default build step in the Times news app template, so if you're looking to try it out, there's no better time than now.

March 11, 2015

Filed under: journalism»new_media

Template Trouble

About nine months ago, I made the first check-in on the Seattle Times news app template. Since that time, it's been at the heart of pretty much everything we've done at the Times, ranging from big investigative projects to Super Bowl coverage to dog name analysis. We've adapted it to form the basis of our web component stack, and made a version that automates Leaflet map creation. It's been a pretty great tool, used by news apps developers, producers, and graphics team members alike.

That said, I think in digital journalism we often talk in glowing terms about our tools, but we discuss their downsides far less often. So let's be honest with ourselves: I love this scaffolding, but it's not perfect. It has issues. And I think those issues say interesting things about not only the template itself, but also newsroom culture, and the challenges of creating tools that can operate there.

  • The templating situation can be confusing. Since it's all JavaScript, it's sometimes hard for scaffolding users to keep track of what's running during the build, and what will run on the client. Generally, we use a different library in each scenario (Lo-Dash during builds, ICanHaz or doT in the browser), but it can still be odd for people who are used to a language split — and worse for those who have little or no programming experience. (There's a short sketch of the split after this list.)
  • Deployment could be better. This really has less to do with our scaffold, and more to do with the environment in which it operates. We don't have great CMS integration, because the hooks don't exist. And we have to keep credentials in a separate file (which isn't checked into Git), because many of our users can't update environment variables on their own machines. We're also still trying to figure out what we check into Git: should Google Sheets go in there? What about their ID numbers?
  • It was great when the paper launched its new responsive site last month, because it meant we finally had reasonable default styles. The news app scaffolding had previously left these up to the project authors, and the result is that we're not nearly consistent enough. I think we have a fine line to walk between "build everything in" and "provide flexibility" — what's good for the main site may not be good for us.
  • Along those same lines, the new CMS offers us a better, responsive layout, but it also took away a lot of flexibility. The result is that we're probably overusing the news app template to compensate. While I think it's great that we have a place for generating unconventional pages, I'm not wild about effectively creating a parallel content system on S3 whenever we need a small amount of control over the page.
  • Old apps are locked to old dependencies. Like any good Node package, we load dependencies for the news app template on a per-project basis. But I've been tinkering with this framework for 8 months now, and several things have changed radically (most notably, a switch from RequireJS to Browserify). Stepping back into old projects often requires a bit of code archaeology to figure out where everything used to live.
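
To make that templating split concrete: a sketch of both contexts, with invented markup. Lo-Dash's ERB-style delimiters run during the Node build, while doT interpolates from an `it` object in the browser:

```javascript
// At build time (Node):
var _ = require("lodash");
var buildTmpl = _.template("<h1><%= headline %></h1>");
console.log(buildTmpl({ headline: "Rendered during the build" }));

// On the client (doT, bundled or loaded separately):
var doT = require("dot");
var clientTmpl = doT.template("<h1>{{=it.headline}}</h1>");
console.log(clientTmpl({ headline: "Rendered in the browser" }));
```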

What are the common threads here? While you could point to the static page approach as being part of the issue, I actually think what causes a lot of these problems is that the intended audience for the news app template is both broad and narrow. It's broad in that its users range from novice journalists to experienced developers (and, indirectly, non-technical editors and reporters feeding data into Google Sheets). It's narrow in that the actual production still requires a high level of technical comfort: familiarity with the command line, new kinds of tooling, and some ability to roll with unexpected bugs.

This is a tough, and self-contradictory, audience for a visualization toolkit. It's not, however, out of character for a general-purpose dev framework. And indeed, when we talk about app scaffolds from any news organization (not just The Seattle Times), that's what they are. They're written to be fast, to be portable, and to generate static files, because those are our priorities as deadline-driven journalists. They are also the far end of a range of newsroom tools, where news apps are at one end and pre-built widgets live on the other. I'm not really worried about where the template lives on that range, and I'm certainly not planning on reducing the complexity — I think it's at a sweet spot right now. But I do worry about the ways that it (and our CMS) fit into newsroom culture.

At the Times, like in many newsrooms, the online presence is largely run by "producers," who curate the stories on the home page and handle the print-to-digital transition process (it's not the same as a "producer" in software development). This process is complicated and highly skilled, because news CMS systems are generally terrible. The web production staff also often work on projects that would, in print, fall under page design: building complex HTML presentations for special stories. This isn't because they're trained designers: producers are often younger, and while it's not entry-level work, it's close. They end up doing this work because trained, HTML-fluent designers are rare, and because nobody else in the newsroom bothers to learn web design.

As a result, we end up in a funny situation: the only people in the newsroom who really understand the web are the producers. Editors and reporters are discouraged from becoming more technically savvy because the workflow is print-first, and the CMS is so intimidating. Meanwhile, producers rarely become editors or reporters because the newsroom can't afford to lose their skills. There's a tremendous gap in newsroom culture between people who produce the content, and people who actually understand the medium in which that content is consumed. While the tooling is not entirely responsible for that, it is a contributing factor.

I think the challenge we face, as newsroom developers, is to be always aware and vigilant of that gap and its causes. Tools like the news app template are important, because they speed up our work, and the work of other technical people. But they don't mitigate the need for better, web-first publishing systems — something that can help diffuse web thinking from a producer-only skill to something that's available throughout the newsroom.

February 5, 2015

Filed under: journalism»industry

What is Data Journalism?

This week, if you want to be horrified by our grim meathook future, check out these posts from Seattle Times news librarian Gene Balk on vaccination rates at Washington State schools. There's a searchable data table and a map, but I'll spoil it for you: a large proportion of parents should probably pack surgical masks and antibiotics with their kids' lunches, because herd immunity is basically a thing of the past.

This kind of database-driven reporting is a staple of Gene's "FYI Guy" blog, and readers seem to enjoy it. Done right, it can help flesh out local coverage in interesting ways, explore topics that are off the beaten path, and find connections that we might otherwise miss. That said, I don't think you can stress enough how much of that depends on the quality of the reporter: Gene is a great researcher, and not everyone has his skills and experience.

By coincidence, yesterday Melissa Bell at Vox announced that they're (re)entering the field of data journalism in an almost parodically titled post. I'm a little confused about the timing, since I thought data journalism was a part of their whole raison d'être, but maybe I'm confusing them with a different scrappy, SEO-oriented news startup. Regardless, welcome to the party! After name-checking Philip Meyer's Precision Journalism, Bell adds a list of nine basic guidelines they plan to use. It's not a bad list, although several items are inoffensively bland (has anyone ever aspired to produce content that isn't "relevant and useful"?).

  1. Vox will work to provide the most relevant and useful data behind the news, when you need it, in ways that help you understand the stories that matter most.
  2. We will work to make all the data behind our stories available to you to download and play with for yourself.
  3. We want you to improve on what we’ve done, to play with the data, visualize it, and help us analyze it — and make our work better.
  4. We will prioritize building data sets that can feed many stories, rather than focusing on one-off projects.
  5. Our data visualizations will be clear, concise, and deep — to help you understand our editorial better. They will adhere to design rules which ensure their accuracy and transparency.
  6. In the event we make a mistake (they do happen), we will swiftly and clearly clarify, correct, and communicate that as transparently as we can.
  7. We will curate and showcase the best data infographics and visualizations on the web.
  8. Visualizations we produce in-house will work well on as many platforms as possible: if you view it on a smartphone, it will function as well as it does on web.
  9. We will curate and publish the best content that our community of readers produces. Our data journalism is as much about you, the community, as it is about us: this is a partnership.

Some of these goals are particularly strong, and we share them at the Seattle Times. Take #2, for example: not only do I think it's important that we publish the data on which our visualizations are built whenever possible, but we also open-source our graphics so that people can see the methodology we used. It's also just good sense to be mobile-friendly (#8), although I personally believe that there are some times when a story simply can't be fully told on a 4" screen.

I'm less sure about curation, either from readers (#9) or around the web (#7), particularly in conjunction with accuracy and corrections (#6). One of the strengths of a newsroom is supposed to be fact-checking, but it's not clear to me what the process is for verification of third-party visualizations, or if Vox plans to do so at all (it hasn't been evident to me as a reader that they do it now). Which is too bad, because I think a kind of real-time "Snopes for bad reporting" is a site I'd definitely support.

But I'm really most skeptical of #4, which Bell elsewhere refers to as "finding, cleaning, and setting up data streams so that they can be the source for repeated stories." It's not that I think it's necessarily a stupid idea. I'm just not sure that it's effective, based on my experience. Data stories are just reporting. Data streams are reporting on top of engineering on top of reporting.

CQ's Economy Tracker, for example, was my team's attempt at a reusable data API, but it turned out to be a frustrating experience to keep it topped off with up-to-date content, the architecture was a hard problem to solve, and the number of stories we pulled out of it probably didn't justify the effort. It turns out that it's hard to find a data set that can actually support a series of articles.

(You may say, at this point, hang on a minute: wasn't Congressional Quarterly an example of exactly what we're talking about? It's a large, data-oriented news organization that sold access to data streams, and maintained datasets that were used to build stories and interactives via the multimedia team. Which is true, but it elides a number of factors: CQ was a single-purpose news site — congress and legislation only — with a huge number of reporters feeding the beast and a large technical staff to tend to it. Vox does not have those advantages, since it's a general-audience, international news site with a much smaller staff.)

More importantly, a "data stream," like an API, demands maintenance, which quickly becomes a drag on the amount of time that can be spent on efforts outside those streams. That's doubly true if you make them public, and people start relying on them. Will Vox sunset these data streams if they stop being useful internally? What are the cutoff criteria? How will they let people know before the source is shut down? Most importantly, how much time will be taken away from reporting to maintain the data products?

When I joined the Seattle Times, I made a pitch to editors that was a little different: instead of designing long-running services, we generally build news apps that are scoped to a specific point in time. In other words, we make stories, the same as the rest of the newsroom does. And just as you wouldn't normally ask a reporter to go back and update all their old stories when new events happen, we don't maintain news apps more than a week or two after publication (barring, of course, normal corrections and serious bug-fixes). Our entire development stack, in fact, is based on this assumption — that's why we publish static files to S3 (which is cheap and easy), instead of running a Rails/Laravel/Node server (which is expensive and hard).
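
In practice, "publish static files to S3" is about this simple; a sketch assuming the aws-sdk npm module and credentials in the environment, with invented bucket and file names:

```javascript
// deploy.js: push a built page to S3, where it's served as a plain
// static file. No server process, nothing to maintain afterward.
var fs = require("fs");
var AWS = require("aws-sdk");

var s3 = new AWS.S3();

s3.putObject({
  Bucket: "projects.example.com", // invented bucket name
  Key: "news-app/index.html",
  Body: fs.readFileSync("build/index.html"),
  ContentType: "text/html",
  ACL: "public-read"
}, function(err) {
  if (err) throw err;
  console.log("Published.");
});
```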

Maybe for Vox, this isn't a problem. After all, they're the people with the "poor man's Wikipedia" card stacks that they maintain for topics over many months, and the evergreen experiments. At the very least, though, it does highlight a very real distinction that goes (in my opinion) beyond "data journalism" and to the core of the digital news mission. Are we building general systems and tools to cover unique stories? Or are we optimizing for semi-predictable products built around APIs and data sources? I'm leaning toward the former because I think it's a better match for a messy, unpredictable, human world. But best of luck to Vox with the latter.
