In 2013, Edward Snowden was asked to prepare a briefing about China's surveillance of its own citizens. Standard intelligence work. But while he researched, a thought took root that wouldn't let go:
If it can be done, it probably is being done. And possibly already has been.
He went looking inside the NSA's own systems. He found everything.
Programs like PRISM, collecting stored communications from Microsoft, Google, Apple, Facebook - content and metadata. XKeyscore, running on over 700 servers across 150 sites worldwide, letting analysts search emails, chats, browsing history by name or keyword - without court approval. BOUNDLESS INFORMANT, logging 221 billion telephone and internet records per month. 1.7 billion intercepted communications per day.
Snowden described his own role bluntly: he helped enable the shift from targeted surveillance of individuals to mass surveillance of entire populations.
That was 2013. And here's the part most people don't realize: even then, the system was already choking on its own success.
They collected everything. They understood nothing.
The NSA's philosophy under Director Keith Alexander was simple: "Collect the whole haystack." Every call, every message, every click. Store it. Tag it. Search it later.
It worked - in theory. In practice, the volume was crushing. NSA analysts internally admitted: "If you target everything, there's no target." Snowden himself put it plainly: "When you collect it all, you understand nothing."
The problem wasn't collection. Collection was solved. The problem was comprehension.
Reading the actual content of billions of messages - understanding context, intent, meaning - required human analysts. And humans don't scale. So the system settled for the next best thing: metadata.
Who called whom. When. For how long. From where. Not the letter - the envelope.
This wasn't a principled decision. It was an engineering limitation. They didn't stop at metadata because they respected your privacy. They stopped because they couldn't process your content. The XKeyscore system could only retain content for 3 to 5 days before it had to be purged. Not for legal reasons. For storage and processing reasons.
Metadata was the compromise. Not the goal.
The bottleneck just disappeared
Large language models changed the equation entirely.
An LLM can read, summarize, classify, flag, and cross-reference millions of text conversations. Not in weeks. In hours. The cost is collapsing - an ACLU analysis found that processing 68,000 images through Google's Gemini costs $1.68. Streaming video analysis: about 10 cents per hour.
Text is cheaper. Messages are smaller. Emails, chats, voice-to-text transcripts - exactly the kind of unstructured content that used to drown analysts - are now trivially processable at scale.
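The collapse is easiest to see as back-of-envelope arithmetic, using only the figures quoted above (the ACLU's Gemini numbers, not official pricing):

```python
# Per-unit costs implied by the figures cited in the text.
# These are the article's numbers, not a pricing quote.
images = 68_000
total_cost = 1.68                  # USD for all 68,000 images
per_image = total_cost / images
print(f"${per_image:.7f} per image")        # roughly $0.000025

video_per_hour = 0.10              # USD per hour of streaming video
year_of_video = video_per_hour * 24 * 365
print(f"~${year_of_video:.0f} per year of continuous video")
```

At a few thousandths of a cent per item, the marginal cost of "reading" one more message rounds to zero.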
Bruce Schneier, one of the most respected voices in security, drew the line clearly: the internet gave us mass surveillance. AI gives us mass spying. Surveillance is collecting data. Spying is understanding it. Before AI, spying couldn't scale. Now it can.
And Schneier doesn't mince words. In early 2026, he said outright that the US, China, and Russia are already doing this.
What changes when content becomes readable
Metadata tells you patterns. Content tells you thoughts. That's the difference, and it's everything.
Before AI, if you used code words in your messages - called a meeting point "the bakery," referred to a plan as "the recipe" - you were functionally invisible to automated systems. Keyword filters are stupid. They match strings, not meaning.
An LLM doesn't match strings. It understands context. It reads "the bakery" and infers from surrounding conversation that you're not talking about bread. Code words, slang, euphemisms - the entire toolkit people used to evade automated surveillance - become transparent to a system that understands language the way a human does. Except it never sleeps, never gets tired, and processes a million conversations while you're finishing one.
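The gap between the two approaches fits in a few lines. The keyword filter below is real and fails exactly as described; the LLM side is sketched only as a prompt, since the actual model call, watchlist terms, and messages here are all invented for illustration:

```python
# A keyword filter matches strings, not meaning.
# "the bakery" as a code word sails straight through it.
WATCHLIST = {"attack", "weapon", "meeting point"}

def keyword_flag(message: str) -> bool:
    """Flag a message if it contains any watchlist substring."""
    text = message.lower()
    return any(term in text for term in WATCHLIST)

msg = "Same as last time. Be at the bakery at nine, bring the recipe."
print(keyword_flag(msg))   # False - the filter sees bread, i.e. nothing

# An LLM-based classifier instead gets the message plus surrounding
# conversation and is asked about intent. Sketched as a prompt template,
# not executed:
PROMPT = (
    "Given the conversation history below, does 'the bakery' refer to "
    "an actual bakery, or is it a code word? Answer with a label and "
    "the passages you relied on.\n\n{conversation}"
)
```

The filter's failure mode is structural: it can only ever match the strings someone thought to list, while the prompt delegates the matching to a system that models context.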
And it gets worse.
Imagine typing into a system: "What does Muhammed Bayram think about the policies of Party X?"
With enough collected communications - emails, messages, social media posts, search history - an LLM can construct a profile. Not a guess. A detailed, sourced, quoted summary of someone's political views, built from their own words. Researchers have already demonstrated political sentiment classification from social media posts with over 80% accuracy using basic models. Feed an LLM years of someone's private messages and the accuracy isn't 80%. It's forensic.
This isn't science fiction. This is a prompt.
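To make that concrete: a minimal sketch of how such a query reduces to prompt assembly. Everything here - the subject, the documents, the question - is invented for illustration; the point is only that there is no exotic machinery between "collected text" and "profile," just concatenation plus a question:

```python
def build_profile_prompt(subject: str, documents: list[str]) -> str:
    """Assemble a sourced-profile query over collected text.

    No model magic on this side: the input is just the evidence
    joined together, followed by the analyst's question.
    """
    corpus = "\n---\n".join(documents)
    return (
        f"Using only the documents below, summarize {subject}'s views "
        f"on Party X. Quote the source passage for every claim.\n\n"
        f"{corpus}"
    )

# Invented example documents standing in for collected communications.
docs = [
    "2019 post: Party X's program would wreck the economy.",
    "2023 message: I can't vote for Party X after last year.",
]
prompt = build_profile_prompt("the subject", docs)
```

The returned string goes to any commercially available LLM as-is; the "detailed, sourced, quoted summary" is its completion.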
The machine was always waiting for a brain
Here's what makes this moment different from every previous surveillance scare:
The infrastructure already exists. The collection systems are built and running. The legal frameworks are already settled - or conveniently absent. The fiber optic taps, the server access agreements, the metadata pipelines - all operational. Have been for over a decade.
What was missing was the ability to think about the data. To read it, interpret it, connect it. The pipes were full but the brain wasn't there.
Now the brain exists. And it's commercially available. You don't need a classified NSA program to run an LLM over text data. You need an API key.
The surveillance state isn't being built. It was built in 2013. It was just blind. Now it can see.
The new normal
China already has over 700 million surveillance cameras - roughly one for every two citizens. Their system can identify people by the way they walk. Multiple Chinese institutions are building AI systems that predict "social governance incidents" based on personality traits and emotional states. This isn't a secret. It's policy.
But the West isn't far behind. The US government reported over 2,100 active AI use cases across federal agencies in 2024. Project Maven uses AI for military intelligence. Palantir holds billions in contracts across defense and immigration enforcement. The NSA established an AI Security Center in 2023, calling itself the leader among intelligence agencies in deploying AI.
And that's just what's public.
Snowden's original fear was simple and devastating: if the capability exists, it will be used. Not because people are evil. Because institutions follow incentives, and the incentive to know everything about everyone is the oldest one in power.
The difference is that in 2013, "everything" meant metadata. Patterns. Connections.
In 2026, "everything" means your thoughts.
The punchline
The surveillance state was never a dystopian future. It was an engineering problem.
The collection was solved a decade ago. The storage was solved a decade ago. The legal immunity was solved a decade ago.
The only thing missing was comprehension.
That just got solved too.