This quarter, I've been teaching ITC 240 at SCC, which is the first of three "web apps" classes. They're in PHP, and the idea is that we start students off with the basics of simple pages, then add frameworks, and finally graduate them to doing full project development sprints, QA and all. As the opening act for all this, I've decided to make a foundational part of the class focused on security.
I've done my best to cultivate paranoia in my students, both by telling them horror stories (the time that Google clicked all the delete links on a badly-hidden admin page, that time when the World Bank got hacked and replaced with pictures of Wolfowitz's socks) and by threatening to attack their homework every time I grade it. I'm not sure that it's actually working. I think you may need to be on the other end of something fairly horrific before it really sinks in how bad a break-in can be. The fact that their homework usually involves tracking personal information for my cat is probably not helping them take it seriously, either.
The thing is, PHP doesn't make it easy to keep users safe. There's a short tag for automatically echoing values out, but it does no escaping of HTML, so it's one memory lapse away from being a cross-site scripting bug. Why the <?= $foo ?> tag doesn't call htmlentities() for you like every other template engine on the planet, I'll never know. The result is that it's trivial to forget to sanitize your outputs — I myself forgot for an entire week, so I can hardly blame students for their slipups.
MySQL also makes this a miserable experience. Coming from a PostgreSQL background, I was unprepared (ha!) for this. Executing a prepared query in MySQL takes at least twice as many lines as in its counterpart, and is conceptually more difficult. You also can't quote table or column names in MySQL, which means that mysqli_real_escape_string is useless for queries with an ORDER BY clause — I've had to teach students about whitelists instead, and I suspect it's going in one ear and out the other.
It may be asking a little much of them anyway. Most of my students are still struggling with source control and editors, much less thinking in terms of security. Several of them have checked their passwords into GitHub, requiring a "password amnesty" where everyone got reset. I'd probably be more upset if I didn't think it was kind of funny, and if I wasn't pretty sure that I'd done the same thing in the past.
But even if they're a little bit overwhelmed, I still believe that students should be learning this stuff from the start, if for no other reason than that some of them are going to get jobs working on products that I use, and I would prefer they didn't give my banking information away to hackers in some godforesaken place like Cleveland. Every week, someone sends me a note to let me know that my information got leaked because they couldn't write a secure website — even companies like eBay, Dropbox, and Sony that should know better. We have to be more secure as an industry. That starts with introducing people to the issues early, so they have time to learn the right way as they improve their skills.
My students were sturdy and patient guinea pigs: source control must have been a shock since many of them had only recently learned about FTP and remote filesystems. Some of them seemed suspicious about the whole "files" thing to begin with, and for them I could only offer my sympathies. I was asking a lot, on top of learning a new language with unfamiliar constraints of its own.
Midway through the quarter, though, workflows developed and people adjusted. I was no longer spending my time answering Git questions and debugging commit issues. As an instructor, it was hugely successful: pulling source code is much easier than using "view source" on hosted pages, and commenting line-by-line on GitHub commits is far superior to code critique via e-mail. I have no qualms about using Git in class again, but shortening the adjustment period is a priority for me.
Using software in a classroom is an amazing way to discover failure cases you would otherwise never see in a million years, and this was no exception. Add the fact that I was teaching it for the first time, and some fun obstacles cropped up. Here's a short list of issues students hit during the first few weeks of class:
For a start, students will be connecting to their servers over SSH to debug and edit their PHP, so I'll be teaching Git from the command line instead of using graphical tools like GitHub for Windows. This sounds more complicated, but it means that the experience is consistent for all students and across all operations. It also means that students will be able to use Pro Git as a textbook and search the web for advice on commands, instead of relying on the generally abysmal help files that come with graphical Git clients and tutorials that I throw together before each quarter.
Of course, Pro Git isn't just valuable because it's a free book that walks users through basics of source control in a friendly manner. It also does a great job of explaining what Git is actually doing at each stage of the way — it explains the concepts behind every command. Treating Git as a black box last quarter ultimately caused more problems than it was worth, and it left people scared of what they were doing. It's worth sacrificing a week of advanced topics like object-orientation (especially in the entry-level class) if it means students actually understand what happens when they stage and commit.
Finally, and perhaps most importantly, I'm going to provide an origin repo for students to clone, and then walk them through setting up a deploy repo as well, with an eye to providing the larger development context. The takeaway is not "here are Git commands you should know," but "this is how and why we use source control to make our lives easier." Using Git in class the same way that people use it in the field is experience that students can take with them.
What do these three parts of my strategy — tooling, concepts, and context — have in common? They're all about process. This is probably unsurprising, as process and workflow have been hobbyhorses of mine since I taught a disastrous capstone class last year. In retrospect, it seems obvious that the last class of the web development program is not an appropriate time for students to be introduced to group development. They were unfamiliar with feature planning, source control, and QA testing — worse, I didn't recognize this in time to turn it into a crash course in project management. As a result, teams spent the entire quarter drifting in and out of crisis.
Best practices, it turns out, are a little like safety protocols around power tools. Granted, my students are a little less likely to lose a finger, but writing code without a plan or a collaboration workflow can still be deadly for a team's progress. I'm proud that the Web Apps class sequence I helped redesign stresses process in addition to raw coding. Git is useful for a lot of reasons, like its ecosystem, but the fact that it gives us a way to introduce basic project management in the very first class of the sequence is high on the list.
Last Friday, I gave a short presentation for a workshop run by the SCCC Byte Club called "Technical Interview Mastery for Women." Despite the name, it was attended by both men and women. Most of my advice was non-gender specific, anyway: I wanted to encourage people to interview productively by taking into account the perspective from the other side of the table, and seeing the process more as a dialog instead of a confrontation.
Still, during the question and answer period, several people asked about being women in the interview process. Given that my co-presenter has many years more experience being a woman, I deferred to her whenever possible, but I did chime in when the conversation turned to interaction styles. One participant said she was ignored if she wasn't assertive enough, but was then considered unpleasant if she stuck up for herself--what could she do about this?
It's one thing, I said, to suggest ways that women should adapt their communications for a male-dominated workplace--that kind of pragmatic code-switching may well do the trick. But I think it's unfair to put all the burden on women to adapt to men. There needs to be a way to remind men that it's their responsibility to act reasonably.
The problem is that it's often difficult to have that conversation without falling afoul of the same double-standard that says women in the workplace shouldn't be too loud. Complaining about sexism tends to raise hackles--meaning that the offending statement not only goes uncorrected, but dialog gets shut down. I don't know that I have any good solutions to that, but I suggested finding ways to phrase the issue akin to Jay Smooth's presentations on How To Tell People They Sound Racist. I like to think that most people aren't trying to be sexist, they're just not very self-aware. This may be a faulty assumption.
My talk at the workshop was specifically about interviewing, but obviously this is an issue that goes beyond hiring. Something is happening between the classroom and the workplace that causes this disparity. We have a word for this--sexism--regardless of the specific mechanics. And I would love to have more discussions of those specifics, but it's like climate change: every time there's a decent conversation in a public forum about solutions, it gets derailed by people who insist loudly that they don't think there's a problem in the first place.
That said, assuming that people just don't realize when they've done something wrong, there are doubtless ways to address the topic without defensiveness. If the description "sexist" derails, I'm personally happy to use other terms, like "unprofessional" or "rude"--I'm just embarrassed that I (and others) need to resort to euphemism. We need to change the culture around this discussion--to make it clear that we (both men and women) take this seriously, including respectful responses to criticism. We can do better, and I'd like to be able to tell future workshops that we're trying.
There are lots of tools for doing diffs between two source files, but I'm not aware of any source control system (save Perforce, which we use at ArenaNet) that do a timeline view of all revisions since a file was first checked in, and none that store the entire revision history in a single, web-friendly format. This is a shame, because my goal for several parts of the textbook is to be able to "replay" the process of writing a script, to show how it develops from a few lines of simple code into larger and more functional units like functions and prototypes. It's possible that someone else has done something like this, but a cursory Google couldn't turn it up, so I made my own.
You don't have to write these files by hand, which is good, because they can get pretty nightmarish. Instead, I've written an authoring tool for putting in multiple revisions (or importing them, using the HTML5 file API), commenting them, and exporting them. Using Ace means the editor is friendly and includes source-highlighting, which is great. You also don't have to worry about writing an output parser: the TLPlayer module is not quite complete, but it's done enough to wire it up to a UI and let people flip through the file, with new lines highlighted in the output.
If you'd like to see a demo, I've started using it for the chapter on writing functions. My goal is to put at least one timelapse at the end of each chapter, so that readers can see the subject matter being used to build at least on real-world code script. By doing these as revision histories, I'm hoping to avoid the common textbook "dump a huge source example into the chapter" syndrome. I know when I see that, my eyes glaze over--I don't see any reason that it's any different for my students.
Although I don't have a license on the textbook files yet (they'll probably be MIT-licensed in the near future), you're welcome to use these two modules for your own projects, and feel free to submit patches (the serialization, in particular, could probably use some love with someone with a stronger parsing background). I'd love to see if this is useful for anyone else, and I'm hoping it will help make this textbook project much friendlier to new developers.
I haven't had a chance to do more than plan a few topics to write about here, since I've been working hard on my textbook. You can now browse the built pages on GitHub without needing to check out the repo. This is a pretty handy way to publish a website. It's not particularly attractive yet, but there are two-and-a-half chapters up so far, and more on the way. I guess those long bus commutes are good for something, right?
In addition to the text, browsing the repo's /js directory will expose a few interesting AMD modules now that I've started building the interactive parts of the book as well. Given my plans for various visualizations and live quizzes, I suspect the script package may be as interesting as the book for a lot of people by the time I'm done. Here's most of what I've got so far:
In order to build the book, you'll also need NodeJS with Grunt, grunt-contrib-less, and grunt-contrib-watch installed. In addition to using LESS and RequireJS, I've written the World's Worst Template system to reduce boilerplate. The repo contains fully-built versions of the site, so you don't need to build it to read, but it would be helpful for pull requests and testing.
As I'm working on this project, I'll commit to the repo and update GitHub. If anyone wants to file pull requests for things that are technically wrong, confusing, or need more explanation, please feel free. Contributors will get their names in a credits section, although I will have to ask that copyright be assigned to me as a precaution, in case I ever wanted this to see print. If nothing else, I hope people find it to be useful as a resource. Teaching has become one of the most rewarding parts of our move to Seattle, and I don't see any reason that should be limited to just my classroom.
There's an ongoing discussion I'm having with the other instructors at Seattle Central Community College on how to improve our web development program. I come from a slightly different perspective than many of them, having worked in large organizations my whole career (most of them are freelancers). So there are places where we agree the program can do better (updating its HTML/CSS standards, reordering the PHP classes) and places where we differ (I'm strongly in favor of merging the design and development tracks). But there's one change that I increasingly think would make the biggest difference to any budding web developer: learning source control.
Source control management (SCM) is important in and of itself. I use it all the time for projects that I'm personally working on, so that I can track my own changes and work across a number of different machines. It's not impossible to be hired without SCM skills, but it is a mark against a potential employee, because no decent development team works without it. And using some kind of version control system is essential to participating in the modern open-source community, which is the number one way I advise students to get concrete experience for their resumes.
But more importantly, tools like Git and Subversion are gateways to the wider developer ecosystem. If you can clone a repo in Git, then you now have a tool for deployments, and you stop developing everything over FTP (local servers become much more convenient). Commits can be checked for validity before they go into the repo, so now you may be looking at automated code parsers and linting tools. All of these are probably going to lead you to other trendy utilities, like preprocessors and live reload. There's a whole world of people working on the web developer experience, creating workflows that didn't exist as recent as two or three years ago, and source control serves as a good introduction to it.
The objection I often hear is that instructors don't have time to keep up with everything across the entire web development field. Whether or not that's a valid complaint (and I feel strongly that it isn't), it's just not that hard to get started with version control these days. Git installs a small Bash shell with a repo-aware prompt. GitHub's client for Windows does not handle the advanced features, but it covers the 90% use case very well with a friendly UI. Tortoise has Windows add-ons for Git, SVN, and Mercurial. Learning to create a repo and commit files has never been easier--everything after that is gravy.
Last quarter, I recommended that teams coordinate their WordPress themes via GitHub, and gave a quick lecture on how to use it. The few teams that took me up on it considered it a good experience, and I had several others tell me they wish they'd used it (instead of manually versioning their work over DropBox). This quarter, I'm accepting homework via repo instead of portal sites, if students want--it'll make grading much easier, if nothing else. But these are stopgap, rearguard measures.
What SCCC needs, I think, is a class just on tooling, covering source control, preprocessing, and scripting. A class like this would serve as a stealth introduction to real-world developer workflows, from start (IDEs and test-driven development) to finish (deployment and build scripts). And it would be taught early enough (preferably concurrent with or right after HTML/CSS) that any of the programming courses could take it as a given. New developers would see a real boost in their value, and I honestly think that many experienced developers might also find it beneficial. Plus it'd be great training for me--I'm always looking for a good reason to dig deeper into my tools. Now I just have to convince the administration to give it a shot.
But let's pretend we weren't bound by the idea of "teach skills that are directly marketable"--i.e., let's pretend I'm not working for a community college (note: there's nothing wrong with teaching marketable skills, it's just a thought exercise). What's a good way to introduce people to the basic problems of programming, like looping and conditionals and syntax?
What about assembly?
Let's go ahead and get some reasons out of the way as to why you shouldn't teach programming with assembly. First, it offers no abstractions or standard libraries, so students won't be learning any immediately-useful skills for outside the classroom. It's architecture-specific, meaning they probably can't even take it from one computer to another. And of course, assembly is not friendly. It doesn't have nice, easy-to-type keywords like "print" and "echo" and "document.getElementById" (okay, so that's not all bad).
What you gain, given the right choice of architecture, is simplicity. Assembly does not give you syntax for loops, or for functions. It doesn't give you structures. You get some memory, a few registers to serve as variables, and some very basic control structures. It's like BASIC, but without all the user-friendliness. Students who have trouble keeping track of what their loops are doing, or how functions work, might respond better to the simple Turing tape-like flow of assembly.
But note that huge caveat: given the right choice of architecture. Ideally, you want a very short set of instructions, so students don't have to learn very much. That probably rules out x86. 6502 has a reasonably small set of opcodes, but they're organized in that hilarious table depending on what goes where and what addressing scheme you're in, which is kind of crazy. 68000 looks like a bigger version of 6502 to me. Chip-8 might work, but I don't really like the sprite system. I'd rather just have text output. What we need is an artificial VM designed for teaching--something that's minimal but fun to work with.
That's why I'm really interested in 0x10c, the space exploration game that's in development by Notch, the creator of Minecraft. The game will include an emulated CPU to run ship functions and other software, and it's programmed in a very simple, friendly form of assembly. There are a ton of people already writing tools for it, including virtual screen access--and the game's not even out yet. It's going to be a really great tool for teaching some people how computers work at a low level (in a simplified model, of course).