Mile Zero :: Latest posts in /tech/

September 3, 2015

Codes of Conduct

Rachel Nabors writes that that you literally cannot pay her to attend a conference without a code of conduct:

When I promised not to go to conferences without Codes of Conduct, I wasn’t paying lip service to a trend, doing the popular thing to gain brownie points with my feminist besties. I meant every word. It is my greatest fear in life that something bad would happen to someone attending a conference I attracted them to.

It was weird the other day to realize that I'm actually becoming a mentor figure for some people: I have interns, I teach classes, I'm now the co-organizer of the local Hacks/Hackers chapter. I'm still basically nobody, but I will also make this pledge: if a conference does not have a code of conduct (or its equivalent), I won't go, as a speaker or an attendee. It's important to me that industry gatherings be places where people are comfortable and safe, no matter who they are. Everyone deserves that much consideration, and while a code of conduct is not a guarantee of safe behavior, it's a good start.

It's disappointing, but perhaps to be expected, that several high-profile white dudes have decided that the most important thing they can do this year is fight against codes of conduct. Many of them feel attacked and want to circle the wagons instead of listening to voices from the community. That's too bad for people who attend conferences — but part of the point of the pledge is that it should punish organizers who don't think creating a standard framework for conference behavior is a priority. If they can't get speakers or attendees to come, sooner or later they don't have an event.

Let's be clear: nobody is entitled to run a conference, and helping people with their careers once upon a time does not excuse you from being a good citizen. If the pressure causes people like Jared Spool and Mike Monteiro to change their mind, everyone benefits. If it doesn't, and they go the way of the dinosaurs, well... that's how evolution works. We'll survive without them.

But I think this is a great opportunity to re-examine one particular community role that I see a lot, both in tech and outside. You know the one: that guy (almost always a man) whose schtick is being the "honest teller of truth," which really means "rude, petty, and abusive in kind of a funny way about stuff we all agree on." A lot of times, we tolerate that behavior because honestly, it is satisfying to have someone saying what we're all thinking about clients who won't pay, or bad design, or ugly code. I have sometimes thought of myself as that person, but I'm trying not to be anymore.

The problem with the "angry truth-teller" is that it stops being funny when they suddenly turn those tools on their allies. When that happens, you realize that it was never actually funny: you just didn't care about being respectful, because you didn't think the target was worthy. Unfortunately, a lack of empathy is not the same as a sense of humor. There might be a thin line between "comedian" and "jerk," but unless you're Don Rickles I don't really see the point in intentionally crossing it.

In the end, life is too short to give money and attention to people who can't have a little empathy over something as silly as tech conference administration. We are not required to make them leaders in our community, nor are we required to keep them as leaders if we decide that the negatives of their input outweigh the positives. There are lots of technically-skilled people out there who can be right about an issue — or wrong on it, even — without being entitled, nasty jackasses about it. Let's boost them instead.

15:27 x permalink

June 25, 2015

A good (virtual) walk spoiled

Earlier today I took the wraps off of the private repo for our Chambers Bay interactive flyover. You can find the source code here, and a post on our dev blog about it here. It was a really fun challenge, and a rare example of using WebGL in a news capacity (the NYT did the Dawn Wall, but that's the only one I can remember recently).

From a technical standpoint, this was my first three.js project, and the experience was largely positive. I think there's a strong case to be made that three.js is basically jQuery for WebGL: sure, you don't need it, but it only takes a couple of features to make it worthwhile. In this case, I didn't particularly feel like writing a model loader or a scene graph. There are still plenty of hooks to write the parts that I do enjoy, like the fragment shader for the landscape (check out that sweet dithering), or the UI for directing the camera. Sure, three.js is a relatively large library, but I'm loading 4MB of textures and another 4MB of gzipped landscape model, so what's a few more hundred KB of code?

WebGL itself runs surprisingly well these days, although failure modes do not seem to be its strong suit. For example, the browser may have WebGL support enabled, but then crash when it tries to render (or it may be lying about support, as with the remote VM sessions used in Times meeting rooms). That said, I was astonished to find that pretty much everything (mobile included) could run the landscape at a solid frame rate, despite the fact that it's a badly-optimized mesh with 150,000 triangles. Even iOS, which usually falls over and dies when WebGL pushes past its skimpy RAM limits, was able to run smoothly once I added a low-res texture for it to use.

This was an ambitious project using some pretty cutting-edge web technology, which makes it interesting in light of arguments that the web suffers from "featuritis". After all, when you're talking about feature overkill, WebGL is a barnstormer. But this story would have been tough to tell another way, and it would have never had the same reach siloed in an app store.

Or take Paul Ford's mind-boggling What is Code? in Businessweek this month: behind Bloomberg's Trapper Keeper design aesthetic, it's a powerful article that integrates animations, videos, and interactive demonstrations with the textual message. Ironically, I saw many of the same people that criticize web apps going wild for Ford's piece, a stance that I can only attribute to sophistry.

My team at the Seattle Times has gradually abandoned the term "news apps," since everyone who hears it assumes we're actually writing iPhone clients for the paper. As a term of art, it has always been clumsy. But it does strike at a crucial quality of what we do, which occupy a gray area between "text" and "program." And if it seems like I'm touchy about pundits who think we should abandon the web, this is in large part the reason why.

Arguments that browsers should just go back to being a document viewers ignore the fact that HTML is not just a text format: it's a hypermedia format, and those have always blurred the already-fuzzy lines between data and code (see also: Excel, Hypercard, and IPython notebooks). It's true that the features of the web platform are often abused. Nobody likes slow navigation, ad popups, or user tracking scripts. But it's those same features that make new kinds of storytelling possible — my journalism is built on the same heavily-structured, "over-tooled" web platform that critics find so objectionable. I wouldn't give that up for all the native apps in the world.

1:16 x permalink

April 3, 2015

What's advanced?

Next week, I'll start teaching ITC 298 at Seattle Central College. It's the first special topics class I've actually gotten to teach (a previous class on tooling couldn't get enough students), and it's on a topic near and dear to my heart: namely, intermediate to advanced JavaScript. But what does that actually mean? If web development were a medieval blacksmith shop, what would you need to know in order to move beyond "apprentice" and hit the road as a journeyman? What is it that separates a jQuery dabbler from a serious front-end coder?

The honest answer is "about five years of practice," but that's not the whole story (nor is it something I can take to a curriculum planning committee). I think there are two areas of growth that students need to be aware of, and that I'm planning on stressing for this quarter: tooling and functional programming.

Tooling

Rebecca Murphey recently wrote a revised baseline for front-end developers, just as I was putting together my syllabus. It's still a great guide to the state of the art, even if I don't agree with everything in it. But I would add that modern JavaScript applications tend to be built on three pillars that students need to understand:

Server code written in NodeJS,
Client code built on top of some kind of MVC, and
A build system (also in Node) that weaves the two together.

Learning their way around this trio is going to be a huge challenge for my students, most of whom still live in a world where individual files are edited and sent to the browser as-is (possibly with a PHP include or two). They haven't built applications with RESTful routes, or written client-side code in a module system. SCC hasn't typically stressed those techniques, which is a shame.

I'm happy to be the person who forces students into the deep end, but I do want to make sure they have a good, structured experience. Throwing everything at students is a quick way to make sure that they get overwhelmed and give up (not a hypothetical scenario: the previous ITC 298 class had exactly that problem, and ended poorly). To ease them in, we'll try building the following sequence of exercises in our directed lab sessions:

simple Node script with callbacks
scraper using async/events/streams
basic site using Hapi.js
authenticated site with session-handling
Grunt scripts to build JavaScript with Browserify
simple MVC with Backbone

The progression starts with Node, and then builds out gradually so that each step conceptually depends on a previous lesson. Along the way, students will learn a lot about how to structure an application across all three of these environments — which brings us to the second, and probably harder, focus of the class.

Functional programming

Undoubtably, students need to have a better grasp of JavaScript fundamentals ("the good parts") if they're to be considered intermediate front-end devs. But how do we break down those fundamentals? We could concentrate on inheritance and object orientation, or think of everything as MVC. Maybe we could spend a bunch of time on modules, or deep-dive into how to write high-performance DOM code. Murphey recommends learning ES2015 features, like fat arrows and destructuring. All of these are important pieces of the JavaScript toolkit, but they are more patterns available for use, not core theoretical knowledge.

The heart of the language, however, remains the humble function. Where other languages have modules, private/static properties, blocks, async/await, classes, and list comprehensions, JavaScript just has first-class functions. Astonishingly, this has actually worked out pretty well, but it means you do really need to understand them in order to read and write code — particularly on Node, where callbacks are still the preferred method of handling concurrency.

Typically, functional coding has been one of the more difficult concepts to introduce in my basic class: we spend some time about halfway through the quarter implementing Array.forEach(), and we wire up a lot of event listeners. Map/reduce usually overwhelms most students, however, and call/apply gets a lot of blank stares. These are literally unavoidable in a well-written, modern JavaScript codebase: we have to find a way to approach them if students are to reach the next stage of their professional development.

The hard part of writing for Node is that you must embrace some degree of functional programming: the continuation-passing style used in the core APIs makes it inescapable. But the great part of writing for Node (especially as the first section of the course) is that it's actually a fairly gentle ramp-up. Callback functions are not that far from event listeners, and the ubiquitous async library softens the difficulty of mapping an array functionally. Between the two, there's no shortage of practice, since there's literally no other way to write a Node program.

That's the other strategy behind the tooling sequence I've laid out. We'll start from Node, and then build toward increasingly complex functional constructs, like modules, constructors, and promises. By the time the class have finished their final projects, they should be old hands at callbacks and closures, which will serve them well in almost any language.

The plan

Originally, I joked with people that we'd spend six weeks reading JavaScript: the Good Parts and then six weeks building a chat application. Two students would have enjoyed this, and the rest would have followed me home and murdered me in my sleep. But the two-part plan laid out in this post hopefully marries the practical and the theoretical in a way that will help students grow. They'll learn about the intricacies of JavaScript's functional quirks, but they'll do so by building real applications to solve real problems.

The specifics of this quarter are still a little bit in flux, and will likely remain so, since I think it's good to be flexible the first time teaching a class. But if you're interested in following along, feel free to check out the class repo, which contains the syllabus, supporting materials, and example code so far. Issues and pull requests are also welcome!

16:31 x permalink

February 18, 2015

Speed Kills

It's an accepted truth on the web that fast pages are better for users — people stay on them longer, follow more links from them, and generally report being happier with them. I think a lot about performance on my projects, because I want readers to be thinking about the story, not distracted by slow load times.

Unfortunately, web performance has a bad rap, in part because it's a complicated topic. Making it work effectively and efficiently means learning a lot about how the browser runtime works, and optimizing for new techniques like GPU transforms. Like everything else in web development, there's also a lot of misinformation out there, and a lot of people who insist that everything was better back when we built everything without all the JavaScript and fancy-pants frameworks.

It's possible that I've been more aware of it, just because I've been working on a project that involves smoothly animating a chart using regular HTML instead of canvas, but it seems like it's been a bad month for that kind of thing. First Peter-Paul Koch wrote a diatribe about client-side templating, insisting that it's a needless performance hit. Then Flipboard wrote about discarding traditional elements entirely, instead rendering everything to a canvas tag in pursuit of 60 frames/second animations. Ironically, you'll notice that these are radically different approaches that both claim they create a better experience.

Instead of just sighing while the usual native app advocates use these posts to bash the web, and given that I am working on a page where high-performance mobile animations are a key part, I thought it'd be nice to talk about some experiments I've run with the approaches found in both. There are a lot of places where the web platform needs help competing on mobile, no doubt. But I'd prefer we talk about actual performance problems, and not get sidetracked into chasing down scattered criticisms without evidence.

Let's start with templating, which is serving as a stand-in for client-side JavaScript in general. PPK argues that templating (and by extension, single-page app design) is terrible for performance, but is that true? While I was working on my graph, I worried a little bit about startup time. Since I write JavaScript on both the server and the client, it was pretty easy to port my code from one to the other and check. I personally found the results conclusive, and a little surprising.

The client-side version of the page weighed in at 10KB and spent roughly 35ms in JavaScript during startup, rendering the page and prepping its data structures. That's actually not bad for something that's doing some fairly heavy positioning and styling, and it fits in the 14KB first TCP round-trip recommended by Google. In contrast, the server-side page, in which all the markup was pre-rendered and then progressively enhanced after page load, was 160KB and spent about 30ms in JavaScript. In other words, following PPK's advice to avoid client-side templating caused the page to be sixteen times larger, and still required two video frames to start up.

Now, this is a slightly special case: unlike typical server applications, my news apps are useless without JavaScript. They're not RESTful, they don't talk to a database, they involve a lot of moving parts. But even I was surprised by how little impact client-side templating actually had. Browsers these days are just ridiculously fast at assembling HTML. So while I don't recommend doing the entire page this way, or abandoning server-generated HTML entirely, it's pretty clear to me that it's not the slam-dunk case that holdouts for traditional server rendering claim it is.

At the other extreme is Flipboard's experiment with canvas rendering. Instead of putting everything in the document, like normal websites, they put a full-screen canvas image up and render everything — text, images, animations, etc. — manually to that buffer. You can try a demo out on your device here. On my Nexus 5, which is a reasonably new device running the latest version of Chrome, it's noticeably choppy. My experience with canvas is that Chrome's implementation is actually much faster than Safari, so I don't expect it to be smooth on iOS either (they've blacklisted tablets, so I can't be sure).

In order to get this "fluid" experience, here's what the Flipboard team threw away:

Accessibility: nothing on the page actually exists to a screen reader. Users that invert their screens or change the text size to make it easier to read are out of luck.
Copy and paste: since there's no document, there's nothing to select for these basic text operations.
Real links: you can't open pages in a new tab. You can't share them using the OS share panel. You can't do anything with the links, because they aren't really there.
GPU acceleration: by doing everything in canvas, they've ignored all the optimizations that browsers actually do to ensure a smooth, battery-efficient experience.
View source: inspecting the Flipboard page gives you no information at all, and the JavaScript is minified without source maps. The app is completely opaque to anyone who wants to learn from it.

The irony of doing all this work for a fluid experience that isn't actually fluid is that the kinds of animations they're doing — transform and opacity — are actually the exact properties that browsers can animate at 60FPS. Much has been written about using GPU compositing for smooth animations, but this article is a great start. If Flipboard had stuck with the DOM and used the GPU fully, they probably could have had fluid animations without leaving all those other features behind.

That's an easy thing to say, but is it true? Here's another experiment from my stacked bar chart: when a toggle is pressed, the chart shifts from being a measure of absolute numbers to relative proportions, with each bar smoothly animating up to 100%. I'm using an adaptation of Paul Lewis' FLIP technique, in which animations are set in JavaScript but run via CSS transitions. In my case, each of the 160+ blocks is measured, assigned a transform to "freeze" it in place as a new GPU layer, then transitioned to its final position with a second transform and "thawed" back into a regular, responsive element.

Even though I'm animating many more elements than Flipboard is doing in their demo, the animation is perfectly smooth on my Nexus 5, and on the aging iPad we use for testing. By doing all the hard computational work up front, and then handing the pre-computed transitions over to the browser, I'm actually not JavaScript-bound at all: everything is done on the graphics chip, and in the C++ compositing layer. The result is a smooth 60FPS during the animation, all done via regular DOM elements. So much for "If you touch the DOM in any way during an animation you’ve already blown through your 16ms frame budget."

Again, I'm not claiming that my use case is a perfect analogue. I'm animating a graphic in response to a single button press, and they're attempting to create an "infinite scroll" (sort of — it's not really a scroll so much as an animated pager). But this idea that "the DOM is lava" and touching it will cause your reader's phones to instantly burst into flames of scorn seems patently ridiculous, especially when we look back at that list of everything that was sacrificed in the single-minded pursuit of speed.

Performance is important, and I care deeply and obsessively about it. As a gamer and a graphics nerd, I love tweaking out those last few frames per second, or adding flashy effects to a page. But it's not the most important thing. It's not more important than making your content available to the blind or visually impaired. It's not more important than providing standard UI actions like copy-and-paste or "open in new tab." And it's not more important than providing a fallback for older and less-powerful devices, the kind that are used by poor readers. Let's keep speed in perspective on the web, and not get so caught up in dogma that we abandon useful techniques like client-side templating and the DOM.

14:38 x permalink

November 12, 2014

grunt-init component

Last week, I wrote a little bit about using custom elements for our election pages. Being able to interact with SVG maps using a simple DOM interface, while still annoying (it's SVG, after all) miles more pleasant than actually using the tags directly. At the end of that post, I recommended that newsrooms thinking about docreating new JavaScript libraries look into Web Components — or at least custom elements. This week, I've got a way to make good on that pitch.

Similar to our news app template, I've put together a Grunt scaffolding for creating bundled custom elements, including HTML templating and CSS, all in a single standalone file. It's our component template — or, as I like to call it, the Poor Journalist's Polymer.

As with the app template, I'm developing the component scaffolding by building projects with it and then integrating the improvements back in. The first is a responsive-frame element that serves as a smaller, easier-to-use replacement for NPR's Pym. I like Pym, and I've used it in several projects now, but it's a little buggy and the setup process is cumbersome. In contrast, the custom elements don't require any JavaScript skills: just include the script to start using them on the page, and they'll connect up with the child elements on the other side of the iframe automatically.

My second testbed project is a Leaflet map element that uses custom HTML to set the map configuration without ever writing a line of JSON (unless you really want to). It's intended to make mapping simple and fast for web producers, while still offering plenty of power for people like me who just want the boilerplate out of the way. Leaflet's a great candidate for this kind of declarative approach, and I think this is a really promising demo for the power of custom elements.

For standalone components like these, the template seems to be working well. I haven't yet solved the problem of easily embedding them in highly opinionated news apps, due to the way that dependencies are handled. It's useful for custom elements to be able to bundle their CSS and other assets into their package, similar to the way that HTML imports and shadow root offer embedded styles, but that means they may not integrate well into projects that already have their own build system. As far as I can tell, the best solution for now will probably be to load the packages from Bower and require() the standalone files from its build directory, which should work with whatever module system you like.

But to be clear, the component template isn't really intended to solve those problems. Its goal is to simplify and modernize the kinds of scripts that, even now, people tend to solve with a jQuery plugin. I'd like to change that, so that more newsrooms produce reusable HTML elements instead of JavaScript spaghetti code. If you build something interesting with the component template, or if it inspires you to make your own, please let me know!

12:50 x permalink

October 3, 2014

WebGL and beyond

We came very close to using WebGL for a Seattle Times special report that will come out next week. Now that iOS 8 has shipped with support for WebGL, albeit in an unstable and slightly buggy form, it's common enough that I felt comfortable using it (with a scaled-down 2D fallback) for our audience. In the end, we went with a different design language and shelved the WebGL experiments, but the experience has left me very excited about the potential for mainstream usage.

It's probably easier to understand why WebGL is exciting by looking at what the regular 2D canvas does badly. 2D canvas is terrible at combining or masking its rendering functions (globalCompositeOperation in particular is dog-slow in Firefox). It doesn't give users easy access to the image data directly, which is frustrating in a bitmap-based drawing API. It doesn't like changing colors or styles frequently. But its biggest weakpoint is actually that it's tied so closely to JavaScript, which is a single-threaded language running in the browser event loop. The more pixels you touch in detail, the slower it gets.

WebGL, by contrast, is great at blending, filtering, and masking. But most importantly, WebGL code moves most (if not all) of the math-heavy graphics code you'd normally write in JavaScript — scaling and transforms, patterns, and color — over to the GPU. Your graphics card is a massive parallel-processing machine, so all your drawing occurs simultaneously, not sequentially. You can alter every pixel in the frame, if you want, and it'll barely take any more time than if you change only a few.

Once I spent a little time writing some simple shaders, I realized that there's a whole range of experiences you can write in WebGL that simply aren't possible on a 2D canvas. I could shift the colors or custom-filter an image on a per-pixel basis. I wrote a dust simulation that simulated thousands of motes on a low-end machine, even with the physics still running in JavaScript. I even created a faux-3D effect a la Depthy, by displacing each pixel by the value of a second texture's lightness and the mouse position.

None of these experiments involved 3D math in any way. They're not spinning teapots, or Unreal Engine demos, or elaborate parallax effects. I suspect that the real value of WebGL isn't going to be from any of those things. It's going to be the fact that it gives the web platform the free-drawing capability of canvas, but uncoupled from the JavaScript execution model that it's been shackled to.

There's an obvious parallel here, which is the first two major versions of Android. Because it was designed to run on low-end hardware, Android drew all its UI via software until 3.0 (and hardware acceleration didn't become widespread until 4.0). The resulting lag was never as bad as critics claimed, but it did mean that a lot of Android looked and felt a bit utilitarian. You wouldn't see something like Material Design emerge until the system supported using the GPU for rendering ordinary UI.

It's not a coincidence that Google's moving to Material Design on both Android and the web. Its design language — a smoothly-animated world of flat, geometric shapes — is attractive, but more importantly it's well-matched to the kinds of flat, geometric shapes that can be animated fluidly in a browser, using the 3D acceleration that's already built into the composition layer. Web Components will give developers a way to package those elements up, and make them reusable. Flexbox makes their layouts scalable and responsive.

But for the web platform to move forward, we need more than just a decent look and feel. We need the ability to write the kinds of applications that people insist that it can't run. WebGL is a step in that direction: graphics with near-native speed and capability, instantly deployed and paired with a surprisingly powerful UI toolkit. The kinds of apps and experiences we can write othe web, for a mainstream and mobile audience, just got a lot bigger. And I for one am looking forward to pushing those boundaries as much as I can.

23:08 x permalink

August 28, 2014

Process Over Programs

This fall, I'll be teaching ITC 210 at Seattle Central College, which is the capstone class of the web development program there. It's taught as a combined class with WEB 210 (the designer's capstone). The last time I taught this course, it didn't go particularly well: although the goal is for students to implement a WordPress site for a real-world client, many of them weren't actually that experienced with the technology.

More importantly, they had never been taught any of the development methods that let teams work together efficiently. I suggested some of the basics — using source control, setting tasks, and using a "waterfall" structure — but I didn't require them, which was a mistake. Under pressure, students fell back on improvised strategies, and many of them ended up in a crunch as a result.

For the upcoming quarter, I plan to remedy those mistakes. But to do so, it's helpful to look at the web development program from a macro level. What is that we're trying to do here, and what should this capstone class actually mean to students?

Although the name has changed, Seattle Central is still very much a community college, and this is very much a trade program. We need to focus on practical job skills, not on CS theory. And so while the faculty are still working on many of the details, one of our goals for curriculum redesign was to create a simple progression between the three web applications classes: first teach basic programming in ITC 240, followed by an MVC framework in ITC 250, and finish the process with a look at development processes (agile, waterfall, last-minute panic, etc.) in ITC 260. By the end, students should feel like they can take a project from start to finish as part of a team in an organized fashion.

Of course, just because that's what our intentions were doesn't mean that it's working out that way. These changes are large shifts in the SCC curriculum, and like steering an Oldsmobile, those take time. So while it would be nice to assume that students have been through the basics of project management by the time that they reach the capstone, I can't count on it — and even then, they probably won't have put it to practice in teams, since the prior classes are individually-graded.

To bring this back to ITC 210, then, we have two problems. First, students don't know how to manage development, because they've spent most of their time just learning how to code. Second, the structure of the class hasn't historically encouraged them to develop those skills. Assignments on the development side tend to be based around the design milestones, which makes their workload "lumpy:" a lot of waiting for design resources, followed by an intense, panicky burst at the end. This may sometimes be an accurate picture of the job, but it's a terrible class experience. Ideally, we want the developers to be working constantly throughout the quarter.

So here's my new plan: this year, ITC 210 will be organized for students around a series of five agile sprints, just like any real-world coding project. At the start of each sprint, they'll assign time and staff to tasks, and at the end of each sprint they'll do a retrospective to help determine their velocity. Grades will be largely organized around documentation of this process. During the last sprint, they'll pick up another team's site and file bugs against it as QA, while fixing the bugs that are filed against them.

This won't entirely smooth out the development process — devs will still be bottlenecked on design work from time to time — but it will make it clear that I expect them to be working the entire time on laying groundwork. It'll also familiarize them with the ways that real teams coordinate their efforts, and it will force them to fit into a common workflow instead of fragmenting into a million angry swarms of random methodology.

I tend to make fun of programmers for thinking that they're the only ones who can invent a workflow, but it's easy to forget that coordinating a team is hard, and nobody comes by it naturally. I made that mistake last time around, and although we scraped by, there were times when it was rough. This quarter, I'm not giving students a choice: they'll work like a regular software team, or they'll fail the course. It may seem harsh, but I think it'll pay off for them when it comes time to do this for a living.

11:09 x permalink

May 28, 2014

Lessons in Security

This quarter, I've been teaching ITC 240 at SCC, which is the first of three "web apps" classes. They're in PHP, and the idea is that we start students off with the basics of simple pages, then add frameworks, and finally graduate them to doing full project development sprints, QA and all. As the opening act for all this, I've decided to make a foundational part of the class focused on security.

Teaching security to students is hard, because security itself is hard. Web security depends on a kind of generalized principle that everyone is out to get you at all times: don't trust the database, the URL, user input, user output, JavaScript, the browser, or yourself. This kind of wariness does not come naturally to people. Eventually everything gets broken.

I've done my best to cultivate paranoia in my students, both by telling them horror stories (the time that Google clicked all the delete links on a badly-hidden admin page, that time when the World Bank got hacked and replaced with pictures of Wolfowitz's socks) and by threatening to attack their homework every time I grade it. I'm not sure that it's actually working. I think you may need to be on the other end of something fairly horrific before it really sinks in how bad a break-in can be. The fact that their homework usually involves tracking personal information for my cat is probably not helping them take it seriously, either.

The thing is, PHP doesn't make it easy to keep users safe. There's a short tag for automatically echoing values out, but it does no escaping of HTML, so it's one memory lapse away from being a cross-site scripting bug. Why the <?= $foo ?> tag doesn't call htmlentities() for you like every other template engine on the planet, I'll never know. The result is that it's trivial to forget to sanitize your outputs — I myself forgot for an entire week, so I can hardly blame students for their slipups.

MySQL also makes this a miserable experience. Coming from a PostgreSQL background, I was unprepared (ha!) for this. Executing a prepared query in MySQL takes at least twice as many lines as in its counterpart, and is conceptually more difficult. You also can't quote table or column names in MySQL, which means that mysqli_real_escape_string is useless for queries with an ORDER BY clause — I've had to teach students about whitelists instead, and I suspect it's going in one ear and out the other.

It may be asking a little much of them anyway. Most of my students are still struggling with source control and editors, much less thinking in terms of security. Several of them have checked their passwords into GitHub, requiring a "password amnesty" where everyone got reset. I'd probably be more upset if I didn't think it was kind of funny, and if I wasn't pretty sure that I'd done the same thing in the past.

But even if they're a little bit overwhelmed, I still believe that students should be learning this stuff from the start, if for no other reason than that some of them are going to get jobs working on products that I use, and I would prefer they didn't give my banking information away to hackers in some godforesaken place like Cleveland. Every week, someone sends me a note to let me know that my information got leaked because they couldn't write a secure website — even companies like eBay, Dropbox, and Sony that should know better. We have to be more secure as an industry. That starts with introducing people to the issues early, so they have time to learn the right way as they improve their skills.

13:45 x permalink

April 30, 2014

Company Towns

Pretend, for a second, that you opened up your web browser one day to buy yourself socks and deoderant from your favorite online retailer (SocksAndSmells.com, maybe). You fill your cart, click buy, and 70% of your money actually goes toward foot coverings and fragrance. The other portion goes to Microsoft, because you're using a computer running Windows.

You'd probably be upset about this, especially since the shop raised prices to compensate for that fee. After all, Microsoft didn't build the store. They don't handle the shipping. They didn't knit the socks. It's unlikely that they've moved into personal care products. Why should they get a cut of your hard-earned footwear budget just because they wrote an operating system?

That's an excellent question. Bear it in mind when reading about how Comixology removed in-app purchases from their comic apps on Apple devices. I've seen a lot of people writing about how awful this is, but everyone seems to be blaming Comixology (or, more accurately, their new owners: Amazon). As far as I can tell, however, they don't have much of a choice.

Consider the strict requirements for in-app purchases on Apple's mobile hardware:

All payments made inside the app must go through Apple, who will take 30% off the top. This is true even the developer handles their own distribution, archival, and account management: Apple takes 30% just for acting as a payment processor.
No other payment methods are allowed in the App Store — no Paypal, no Google Wallet, no Amazon payments. (No Bitcoin, of course, but no Monopoly money either, so that's probably fair.) Developers can't process their own payments or accept credit cards. It's Apple or nothing.
Vendors can run a web storefront and then download content purchased online in the app... but they can't link to the site or acknowledge its existence in any way. They can't even write a description of how to buy content. Better hope users can figure it out!

Apple didn't write the Comixology app. They didn't build the infrastructure that powers it, or sign the deals that fill it with content. They don't store the comics, and they don't handle the digital conversion. But they want 30 cents out of every dollar that Comixology makes, just for the privilege of manufacturing the screen you're reading on. If Microsoft had tried to pull this trick in the 90s, can you imagine the hue and cry?

This is classic, harmful rent-seeking behavior: Apple controls everything about their platform, including its only software distribution mechanism, and they can (and do) enforce rules to effectively tax everything that platform touches. There was enough developer protest to allow the online store exception, but even then Apple ensures that it's a cumbersome, ungainly experience. The deck is always stacked against the competition.

Unfortunately, that water has been boiling for a few years now, so most people don't seem to notice they're being cooked. Indeed, you get pieces like this one instead, which manages to describe the situation with reasonable accuracy and then (with a straight face) proposes that Apple should have more market power as a solution. It's like listening to miners in a company town complain that they have to travel a long way for shopping. If only we could just buy everything from the boss at a high markup — that scrip sure is a handy currency!

It's a shame that Comixology was bought by Amazon, because it distorts the narrative: Apple was found guilty of collusion and price fixing after they worked with book publishers to force Amazon onto an agency model for e-books, so now this can all be framed as a rivalry. If a small company had made this stand, we might be able to have a real conversation about how terrible this artificial marketplace actually is, and how much value is lost. Of course, if a small company did this, nobody would pay attention: for better or worse, it takes an Amazon to opt out of Apple's rules successfully (and I suspect it will be successful — it's worked for them on Kindle).

I get tired of saying it over and over again, but this is why the open web is important. If anyone charged 30% for purchases through your browser, there would be riots in the street (and rightly so). For all its flaws and annoyances, the only real competition to the closed, exploitative mobile marketplaces is the web. The only place where a small company can have equal standing with the tech giants is in your browser. In the short term, pushing companies out of walled gardens for payments is annoying for consumers. But in the long term, these policies might even be doing us a favor by sending people out of the app and onto the web: that's where we need to be anyway.

17:59 x permalink

April 25, 2014

Service as a Service

If you've ever wanted to get in touch with more people who are either unhinged or incredibly needy (or both), by all means, start a modestly successful open source project.

That sounds bitter. Let me rephrase: one of the surprising aspects of open-sourcing Caret has been that much of the time I spend on it does not involve coding at all. Instead, it's community management that absorbs my energy. Don't get me wrong: I'm happy to have an audience. Caret users seem like a great group of people, in general. But in my grumpier moments, after closing issue requests and answering clueless user questions (sample, and I am not making this up: "how do I save a file?"), there are times I really sympathize with project leaders who simply abandon their code. You got this editor for free, I want to say: and now you expect me to work miracles too?

Take a pull request, for example. That's when someone else does the work to implement a feature, then sends me a note on GitHub with a button I can press to automatically merge it in. Sounds easy, right? The problem is that someone may have written that code, but it's almost guaranteed that they won't be the one maintaining it (that would be me). Before I accept a pull request, I have to read through the whole thing to make sure it doesn't do anything crazy, check the code style against the rest of Caret, and keep an eye out for how these changes will fit in with future plans. In some cases, the end result has to be a nicely-worded rejection note, which feels terrible to write and to receive. Either way, it's often hours of work for something as simple as a a new tab button.

These are not new problems, and I'm not the first person to comment on them. Steve Klabnik compares the process to being an "open source gardener," which horrifies me a little since I have yet to meet a plant I can't kill. But it is surprising to me how badly "social" code sites handle the social part of open source. For example, on GitHub, they finally added a "block" feature, but there's no fine-grained permissions--it's all or nothing, on a per-user basis. All projects there also automatically get a wiki that's editable by any user, whether they own the repo or not, which seems ripe for abuse.

Ultimately, the burden of community management falls on me, not on the tools. Oddly enough, there don't seem to be a lot of written guides for improving open source management skills. A quick search turned up Producing Open Source Software by Karl Fogel, but otherwise everyone seems to learn on their own. That would do a lot to explain the wide difference in tone between a lot of projects, like the wide difference I see between Chromium (always pleasant) and Mozilla (surprisingly abrasive, even before the Eich fiasco).

If I had a chance to do it all again, despite all the hassle, I would probably keep all the communication channels open. I think it's important to be nice to people, and to offer help instead of just dumping a tool on the world. And I like to think that being responsive has helped account for the nearly 60,000 people who use Caret weekly. But I would also set rules for myself, to keep the problem manageable. I'd set times for when I answer e-mails, or when I close issues each day. I'd probably disable the e-mail subscription feature for the repository. I'd spend some time early on writing up a style guide for contributors.

All of these are ways of setting boundaries, but they're also the way a project gets a healthy culture. I have a tremendous amount of respect for projects like Chromium that manage to be both successful and — whenever I talk to their organizers — pleasant and understanding. Other people may be able to maintain that kind of demeanor full-time, but I'm too grumpy, and nobody's compensating me for being nice (apart from the one person who sends me a quarter on Gittip every week). So if you're in contact with me about Caret, and I seem to be taking a little longer these days to get back to you, just remember what you're paying for service.

11:09 x permalink

Past - Present - Future