this space intentionally left blank

June 15, 2009


A Surprising Depth

This weekend, the total number of posts on Twitter exceeded the possible range of a 32-bit number--

...hang on a second, and let's marvel at the sheer size of that. It is easy to forget that a 32-bit number is actually mind-bogglingly huge, since hey, there's only 32 bits. Remember, however, that bits are like the pennies on the chessboard in that old mind-teaser: each one is worth double the one before it. So in binary, you may start out counting 1, 2, 4, 8... but by the time you hit your 16th bit you have 65,536 possible values, and it just keeps doubling from there. 32 binary digits is enough to count past 4 billion: up to 4,294,967,295, to be exact. Even in the signed integer format Twitter uses, which reserves one bit for negative numbers, it's still more than 2 billion. That's a lot of work for only 32 bits.
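If you'd rather check the arithmetic than take my word for it, here is a minimal Java sketch of the doubling. The class and method names (BitRanges, distinctValues, and so on) are my own inventions for illustration, and the signed calculation assumes the usual two's-complement layout that Java's int uses; I have no idea what Twitter's servers actually run.

```java
// Hypothetical helper class, written only to illustrate the doubling.
class BitRanges {
    // Number of distinct values an n-bit field can hold: 2^n.
    // Uses a 64-bit long so the result itself does not overflow (valid for 0..62 bits).
    public static long distinctValues(int bits) {
        return 1L << bits;
    }

    // Largest unsigned value that fits in n bits: 2^n - 1.
    public static long maxUnsigned(int bits) {
        return distinctValues(bits) - 1;
    }

    // Largest signed value in n bits, assuming two's complement: 2^(n-1) - 1.
    public static long maxSigned(int bits) {
        return distinctValues(bits - 1) - 1;
    }

    public static void main(String[] args) {
        System.out.println(distinctValues(16)); // 65536
        System.out.println(maxUnsigned(32));    // 4294967295
        System.out.println(maxSigned(32));      // 2147483647
    }
}
```

The last figure is the same value Java exposes as Integer.MAX_VALUE, which is the ceiling that matters for the rest of this story.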

Anyway! So the service passed the 2,147,483,647 mark this weekend with, hilariously, a post claiming ungrammatically that "The Tweets must flow" and linking to a lolcat--

...sorry, I have to make another digression here regarding the terminology in question. There's a widespread idea, encouraged by Twitter itself, that updates should be referred to as "tweets." This is, pardon my curmudgeonliness, really stupid. First of all, Twitter is nothing more than a microblog, and we already have a term for the basic units of a blog: we call them posts. Where Twitter primarily diverges from something like Blogspot is only in the speed and hothouse intensity of its ecosystem, and you can even see examples of this in people who write both long-form blogs and Twitter posts. The kind of person, for example, who writes polished, slightly pretentious management advice on his or her blog will also tend to write polished, highly pretentious posts on Twitter. In my own case, it's more of a sidechannel for links and petty commentary, but the voice is not radically different.

For another thing: just because it's the Internet, you don't have to leave all your dignity at the door. "Tweets?" Really? Do you know how embarrassing that sounds, like when you're in the middle of an editorial meeting, and a bunch of middle-aged journalists start talking about the tweets they've written lately? Because I do, and it'll turn your hair white.

Right! Back on topic: Twitter passed the 32-bit overflow mark this weekend. That is not just a landmark number, it's also a potential bug for Twitter clients coded to use only a 32-bit integer in their data structures. Of course, such clients are relatively rare: modern high-level programming languages often express their numbers in 64-bit formats (the range of which I find almost incomprehensibly vast). But older languages, such as C++ and Java, may default to 32-bit integers unless told otherwise--I remember learning them as "long" integers, which says a lot about the progress that we've made. Coders these days should know better than to use the int type for something like a Twitter post ID, but it's an easy mistake to make.
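To make the mistake concrete, here is a rough Java sketch of what can go wrong. To be clear, this is not code from any real client: the class PostIdOverflow and its methods are hypothetical, and I'm assuming the post ID arrives as a decimal string, which may not match how any given app receives it. It shows the two usual failure modes, a loud exception when parsing and a silent wraparound when narrowing, next to the 64-bit fix.

```java
// Hypothetical example; not taken from any actual Twitter client.
class PostIdOverflow {
    // The easy mistake: a 32-bit int. Integer.parseInt throws
    // NumberFormatException for anything above 2,147,483,647.
    public static int parseAsInt(String id) {
        return Integer.parseInt(id);
    }

    // The fix: a 64-bit long, which has room to spare.
    public static long parseAsLong(String id) {
        return Long.parseLong(id);
    }

    // The quieter failure: casting a long down to an int keeps only the
    // low 32 bits, so the first ID past the limit wraps to a negative number.
    public static int narrow(long id) {
        return (int) id;
    }

    public static void main(String[] args) {
        String firstIdPastLimit = "2147483648"; // Integer.MAX_VALUE + 1

        System.out.println(parseAsLong(firstIdPastLimit)); // 2147483648
        System.out.println(narrow(2147483648L));           // -2147483648

        try {
            parseAsInt(firstIdPastLimit);
        } catch (NumberFormatException e) {
            System.out.println("int parse failed: " + e.getMessage());
        }
    }
}
```

Of the two, the exception is arguably the kinder failure: at least it announces itself, whereas a wrapped ID just quietly points at the wrong post, or at none.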

Sure enough, a few clients did not handle the overflow well. On the iPhone, Twitterrific began to crash for people. My favorite Android client, Twit2Go, also threw exceptions when it went to retrieve posts. I like Twit2Go quite a bit (it's fast and acts like a real Android app, with good long-press and menu-button behavior), so this was annoying.

But it was also an interesting study in distribution methods. The developer of Twit2Go started working on the problem on Saturday morning, soon after it surfaced. Five hours later, he uploaded the fixed version to the Android Market, and it was immediately available to users. The developers of Twitterrific, IconFactory, were actually on the job even earlier: a Friday post on the company blog noted the bug, including details about a previous update that, unfortunately, had not caught all the errors. At 6PM on Saturday (almost exactly the same time as Twit2Go), IconFactory submitted an update to the App Store that fixed the remaining bugs. The free version only made it through to end users this morning, and as of this moment the paid version is still in approval limbo. Drama ensued.

The problem overwhelmingly faced by open platform advocates is that abstract dilemmas are hard to transfer into mainstream, non-geek experience. Try discussing DRM with the average person, or explaining the "shallow bugs" principle to them, and watch eyes glaze over faster than Krispy Kreme. But this is a great, easy-to-translate example of the walled garden problem: because control is centralized and accountability is non-existent, paying customers have been unable to use updated software for two days now, for no other reason than the arbitrary whims of a large corporation. No appeals, no alternatives.

To sum up: "binary numbers are very large, think twice before coining a neologism, and make sure you own your hardware." Not a bad range of topics for a service with a 140-character limit, but I suspect it lacks flavor by comparison. As usual, it's more interesting to write about Twitter than to write something interesting using it.
