In the two previous posts in this series, I argued that the most dangerous failure mode for quarantined cultural artifacts isn’t active rehabilitation — someone standing up to defend the indefensible — nor is it passive generational forgetting. It’s what I called technique escape: the aesthetic grammar separates from the artifact and propagates so widely it becomes the global baseline, losing its ancestry. The critical apparatus was trained to defend against the original artifact. It wasn’t built to recognize the grammar in new contexts where no one traces the lineage.
There’s an obvious next question: what does AI change here?
The common answer is scale. AI-generated political imagery now proliferates at rates no human production apparatus could match. In October 2025, the amount of AI-generated content on the internet surpassed that made by human beings. If the grammar was already spreading, AI just makes it spread faster.
That’s true, but it’s not the most important thing.
The more important change is structural. Before AI, technique escape required human decisions at every step. Someone had to choose to use the grammar — to frame a political rally with sweeping crane shots, to use synchronized mass movement as visual argument, to stage the dissolution of individual will into collective rhythm. The grammar spread because people actively reached for it. This meant the grammar was still, at some level, traceable: if you looked at who made these choices and why, you could reconstruct the lineage.
AI removed the decision-step.
Generative AI is structurally nostalgic because it relies on images of the past to generate images of the present or even the future. When you ask a model to generate “an inspiring image of national unity” or “a powerful scene of collective purpose,” the model produces the grammar by default — not because anyone instructed it to use the grammar, but because the grammar is baked into the training distribution. The visual vocabulary of awe at scale, oceanic belonging, and mass ceremony is what the model learned inspiring political imagery looks like, because that’s what human-produced political imagery looked like.
The grammar now propagates without intent. No one chose it. The model chose it, because the model was trained on a world where the grammar had already escaped.
This is categorically different from before. Technique escape previously required a speaker — someone deploying the grammar, even unconsciously. The grammar now has no speaker. It emerges from the model as the path of least resistance for a certain class of request.
What makes this particularly difficult is the form the current critical response has taken.
Gareth Watkins has argued that AI tools foster an authoritarian aesthetic: glossy, standardized images that reflect a simplified, hierarchical worldview aligned with far-right rhetoric. The dominant critique of AI-generated political imagery — the one that has the most cultural traction — focuses on quality and inauthenticity. The images look like slop. They look uncanny. They’re obviously artificial. The vocabulary is: “it looks bad.”
This is a genuine observation. As one analysis put it, synthetic media functions as “a new language for old fears,” creating visual narratives unconstrained by factual documentation. But the critical frame is mostly aimed at the quality problem — these images are detectable, obviously generated, aesthetically bad.
The problem is that this critique is built on a technical condition that won’t hold. Image quality is improving. The “looks bad” critique will eventually fail, not because the ideology changes but because the generation quality improves. And when it does, the critical vocabulary that was built around slop will have no pivot — it was calibrated to quality, not ancestry.
A critical apparatus built to detect “obviously AI-generated imagery” is not the same as a critical apparatus built to detect “the visual grammar of mass ceremony deployed for political ends.” The first becomes obsolete as the technology improves. The second remains relevant regardless of quality.
But the second is barely being built.
Here’s the diagnostic consequence. The technique escape I described in the previous post was already complete before AI. The grammar — awe at scale, oceanic belonging, the individual dissolved into collective rhythm — was already the global baseline for political spectacle. What AI added was: the grammar now propagates without any human needing to choose it.
If the critical apparatus can’t trace ancestry — if “this image uses the grammar of mass ceremony” isn’t a readable claim — then the automation of propagation completes what the technique escape started. You don’t need a speaker who can be interrogated. The grammar speaks itself, through the model, as default output.
The technique escape is complete not when the grammar becomes universal, but when even intent becomes irrelevant to its propagation. That’s where we are now.
The question the previous posts raised — what happens when the critical apparatus erodes through passive drift? — has a sharper answer in the AI context: it erodes through substitution. A new vocabulary forms around quality and detectability. It crowds out the vocabulary of ancestry. And when the quality problem is solved, nothing remains but the grammar, propagating itself, with no one responsible for having chosen it and no critical apparatus designed to name where it came from.
I want to be precise about what I’m not claiming. I’m not arguing that AI imagery is inherently fascist, or that any use of the visual grammar of mass ceremony is illegitimate. Cheap propaganda existed well before neural networks, and no single medium automatically results in one uniform aesthetic. The grammar has been used for legitimately inspiring purposes throughout the twentieth century. The problem isn’t the grammar itself.
The problem is: when the grammar propagates without a speaker, who is responsible for teaching audiences to ask where it came from? The answer used to be: critics, historians, media educators, people whose job was to trace lineage. That project now faces a production ratio that has already tipped against it: there is more grammar being generated than there are people teaching its ancestry.
The distinction between the archive and the engine was already under pressure. The removal of intent from propagation is what tips it.