
April 4, 2024

Filed under: tech

Spam, Scam, Scale

I got my first cell phone roughly 20 years ago, a Nokia candybar with a color screen that rivaled the original GBA for illegibility. At the time, I was one of the last people my age I knew who still relied entirely on a landline. Even for someone like me, who resisted the tech as long as I could (I still didn't really text for years afterward), it was clear that this was a complete paradigm shift. You could call anyone from anywhere — well, as long as you were in one of the (mostly urban) coverage areas. It was like science fiction.

Today I almost never answer the phone if I can help it, since the only calls I actually get are from con artists looking to buy houses I don't own, political cold-callers, or recorded messages in languages I don't speak. The waste of this infuriates me: we built, as a civilization, a piece of communication infrastructure that was completely mind-boggling, and then abandoned it to rot in only a few short years.

If you think that can't happen to the Internet — that it's not in danger of happening now — you need to think again. Shrimp Jesus is coming for us.

Welcome to the scam economy

According to a report from 404 Media, the hot social media trend is a scam based around a series of ludicrous computer-generated images, including the following subjects:

...AI-deformed women breastfeeding, tiny cows, celebrities with amputations that they do not have in real life, Jesus as a shrimp, Jesus as a collection of Fanta bottles, Jesus as sand sculpture, Jesus as a series of ramen noodles, Jesus as a shrimp mixed with Sprite bottles and ramen noodles, Jesus made of plastic bottles and posing with large-breasted AI-generated female soldiers, Jesus on a plane with AI-generated sexy flight attendants, giant golden Jesus being excavated from a river, golden helicopter Jesus, banana Jesus, coffee Jesus, goldfish Jesus, rice Jesus, any number of AI-generated female soldiers on a page called “Beautiful Military,” a page called Everything Skull, which is exactly what it sounds like, malnourished dogs, Indigenous identity pages, beautiful landscapes, flower arrangements, weird cakes, etc.

These "photos," bizarre as they may be, aren't just getting organic engagement from people who don't seem particularly discerning about their provenance or subject matter. They're also being boosted by Facebook's algorithmic feeds: if you comment on or react to one of these images, more are recommended to you. People who click on the link under the image are then sent to a content mill site full of fraudulent ads provided through Google's platform, meaning that at least two major tech companies are effectively complicit.

Shrimp Jesus is an obvious and deeply stupid scam, but it's also a completely predictable one. It's exactly what experts and bystanders said would happen as soon as generative tools started rolling out: people would use them to run petty scams, producing massive amounts of garbage in order to trawl for the tiny percentage of people foolish enough to engage.

This was predictable precisely because we live in a scam economy now, and that fact is inextricable from the size and connectivity of the networked world. There's a fundamental difference between a con artist who has to target an individual over a sustained period of time and a spammer who can spray-and-pray millions of e-mails in the hopes that they find a few gullible marks. Spam has become the business model: venture capitalists strip-mine useful infrastructure (taxis and public transit, housing, electrical power grids, communication networks) with artificial cash infusions until the result is too big to fail.
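
To make the spray-and-pray arithmetic concrete, here's a back-of-the-envelope sketch in Python. Every figure in it (per-message cost, response rate, payout per mark) is an illustrative assumption, not data from any report:

```python
# Back-of-the-envelope spam economics. Every number below is an
# illustrative assumption, not a measured value.

messages_sent = 10_000_000   # spray-and-pray volume
cost_per_message = 0.0001    # assumed near-zero marginal cost, in dollars
response_rate = 0.00005      # assumed: 1 in 20,000 recipients takes the bait
payout_per_mark = 50.0       # assumed average take per successful scam

cost = messages_sent * cost_per_message   # $1,000
marks = messages_sent * response_rate     # 500 people
revenue = marks * payout_per_mark         # $25,000

print(f"cost: ${cost:,.2f}, marks: {marks:,.0f}, revenue: ${revenue:,.2f}")
# The scheme only works because sending another million messages costs
# almost nothing; a con artist working one target at a time can't scale this way.
```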

Big Trouble

It's not particularly original to argue that modern capitalism eats itself, or that the VC obsession with growth distorts everything it touches. But there's an implicit assumption by a lot of people that it's the money that's the problem — that big networks and systems on their own are fine, or are actually good. I'm increasingly convinced that's wrong, and that in fact scale itself is the problem.

Dan Luu has a post on the "diseconomies of scale" where he makes a strong argument along the same lines, essentially stating that (counter to the conventional wisdom) big companies are worse than small companies at fighting abuse, for a variety of reasons:

  • At a certain size they automate anti-fraud efforts, and the automation is worse at it than humans are.
  • Moderation is expensive, so it gets underfunded in order to maintain the profits expected from a multinational tech company.
  • The systems used by these companies are so big and complicated that they actually can't effectively debug their processes or fully understand how abuse is occurring.

The last is particularly notable in the context of Our Lord of Perpetual Crayfish, given that large language models and other forms of ML in use now are notoriously chaotic, opaque, unknowably complicated math equations.

As we've watched company after company this year, having reached either market saturation or some perceived level of user lock-in, pivot to exploitation (jacking up prices, reducing perks, shoveling in ads, or all three), it's hard not to wonder: maybe it's not that these services are hosts for scams. Maybe at a certain size, a corporation is functionally indistinguishable from a scam.

The conventional wisdom for a long time, at least in the US, was that big companies were able to find efficiencies that smaller companies couldn't manage. But Luu's research seems to indicate that in software, that's not the case, and it's probably not true elsewhere. Instead, what a certain size actually does is hide externalities by putting distance — physical, emotional, and organizational — between people making decisions (both in management and at the consumer level) and the negative consequences.

Corporate AI is basically a speedrun of this process: it depends on vast repositories of structured training data, meaning that its own output will eventually poison it, like a prion disease from cannibalism. But the fear of endless AI-generated content is itself a scam: running something like ChatGPT isn't cheap or physically safe. It guzzles down vast quantities of water, power, and human misery (that AI "alignment" that people talk about so often is just sparkling sweatshop labor). It can still do a tremendous amount of harm while the investors are willing to burn cash on it, but in ways that are concrete and contemporary, not "paperclip optimizer" scaremongering.

What if we made scale illegal?

I know, that sounds completely deranged. But hear me out.

A few years ago, writer/cartoonist Ryan North said something that's stuck with me for a while:

Sometimes I feel like my most extreme belief is that if a website is too big to moderate, then it shouldn't be that big. If your website is SO BIG that you can't control it, then stop growing the website until you can.

A common throughline of Silicon Valley ideology is a kind of blinkered free speech libertarianism. Some of this is probably legitimately ideological, but I suspect much of it also comes from the fact that moderation is expensive to build out compared to technical systems, and thus almost all tech companies have automated it. This leads to the kind of sleight of hand that we see regularly from Facebook, which Erin Kissane noted in her series of posts on Myanmar. Facebook regularly states that their automated systems "detect more than 95% of the hate speech they remove." Kissane writes (emphasis in the original):

At a glance, this looks good. Ninety-five percent is a lot! But since we know from the disclosed material that based on internal estimates the takedown rates for hate speech are at or below 5%, what’s going on here?

Here’s what Meta is actually saying: Sure, they might identify and remove only a tiny fraction of dangerous and hateful speech on Facebook, but of that tiny fraction, their AI classifiers catch about 95–98% before users report it. That’s literally the whole game, here.

So…the most generous number from the disclosed memos has Meta removing 5% of hate speech on Facebook. That would mean that for every 2,000 hateful posts or comments, Meta removes about 100: 95 automatically and 5 via user reports. In this example, 1,900 of the original 2,000 messages remain up and circulating. So based on the generous 5% removal rate, their AI systems nailed…4.75% of hate speech. That’s the level of performance they’re bragging about.
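
To make the arithmetic in that example explicit, here's a minimal sketch. The 2,000-post example, the 5% removal estimate from the disclosed memos, and the roughly 95% automated share all come from the quote above; nothing else is assumed:

```python
# Reproducing Kissane's moderation arithmetic: roughly 5% of hate speech gets
# removed at all, and ~95% of those removals happen automatically.

total_hateful_posts = 2_000
overall_removal_rate = 0.05   # internal estimate from the disclosed memos
automated_share = 0.95        # the "95% proactive" figure applies only to removals

removed = total_hateful_posts * overall_removal_rate    # 100 posts
removed_automatically = removed * automated_share       # 95 posts
removed_via_reports = removed - removed_automatically   # 5 posts
still_circulating = total_hateful_posts - removed       # 1,900 posts

automated_catch_rate = removed_automatically / total_hateful_posts
print(f"{still_circulating:,.0f} posts remain up; "
      f"classifiers caught {automated_catch_rate:.2%} of all hate speech")
# Output: 1,900 posts remain up; classifiers caught 4.75% of all hate speech
```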

The claim that these companies are making is that automation is the only way to handle a service for millions or billions of users. But of course, the automation isn't handling it. For all intents and purposes, especially outside of OECD member nations, Facebook is basically unmoderated. That's why it got so big, not the other way around.

More knowledgeable people than me have written about the complicated debate over Section 230, the law that provides (again, in the US) a safe harbor for companies around user-generated content. I'm vaguely convinced that it would be a bad idea to repeal it entirely. But I think, as North argues, that a stronger argument is not to legislate the content directly, but to require companies to meet specific thresholds for human moderation (and while we're at it, to pay those moderators a premium wage). If you can't afford to have people in the loop to support your product, shut it down.

We probably can't make "being a big company" illegal. But we can prosecute for large-scale pollution and climate damage. We can regulate bait-and-switch pay models and worker exploitation. We can require companies to pay for moderation when they launch services in new markets. Stronger data privacy governance can make it more costly to run a business model like advertising, which depends on lots of eyeballs. We can't make scale illegal, but we could make it pay its actual bills, and that might be enough.

In the meantime, I'd just like to be able to answer my phone again.
