Wednesday, May 6, 2026

Your Oral Exam Won't Save You

We just ran an experiment where we gave the US Army War College’s oral comprehensive exam to four commercial AI systems.

They all passed. One got an A.

This was the MILBENCH experiment, conducted here at the US Army War College in early 2026. It was the same exam we give our students (senior military officers preparing for senior staff, command, and general officer responsibilities). Same rubrics. Same faculty panels. 

We've been giving this exam for years; the faculty who administered it are experienced examiners with deep content expertise, and the rubric has been regularly updated and refined. The AI systems had no access to any course materials. They came in cold, and they performed at a level that, for a human student, would be considered competent to excellent.

I mention this because I keep hearing people say that oral exams are the answer to AI in education. The logic sounds right: “AI can write essays, but students can't fake their way through a live conversation the way they can paste an AI's output into a paper.”

Except they can. Or more precisely, the AI can. We watched it happen.

What Passing Looked Like

Since 2024, every major AI platform has shipped voice-enabled modes that hold real-time spoken conversations with natural pauses, varied pitch, and sub-three-second response times. You can interrupt mid-sentence if you like and the AI adjusts. We experienced all of this in these sessions.

Fluent expression is not the same as intelligent expression, of course, but the AI systems didn't stumble here, either. They opened with structured, articulate responses that demonstrated command of relevant frameworks. They cited appropriate theorists. They identified alternatives. They addressed counterarguments. They maintained a professional tone (for the most part) throughout. The AI responses, again for the most part, weren't just passable. They were polished.

Coverage Verification vs. Boundary Finding

After the experiment, I went back and analyzed the faculty questioning using my Ecology of Questions (EOQ) framework, a system I've been developing on sabbatical that evaluates questioning architectures across 42 factors drawn from 55 distinct questioning traditions, everything from Socratic dialogue to intelligence analysis to FBI crisis negotiation. EOQ gave me a structured way to look at what the examiners were asking, not just what the AI was saying.

And we found something that I think applies well beyond our walls.

Most of the questions fell into what I'd call coverage verification: Did the student hit the checkpoints? Did they reference the right frameworks? Did they mention the relevant actors? Once the checkpoints were confirmed, the examiners often moved on. One faculty member said it explicitly: "We're not here to play stump the chump."

This is a perfectly rational approach to oral examination if the purpose is to confirm that the student absorbed the curriculum. Unfortunately, that is also exactly what AI is best at. AI can "hit checkpoints" in many domains without having "learned" anything. It has the form of understanding without the substance.

The alternative is boundary finding: probing until you discover where understanding actually breaks down. Pushing past the prepared answers into territory the student didn't anticipate. Challenging positions. Introducing new information and watching how they respond. Finding the edge of what they know and examining how they behave at that edge.

Boundary finding is harder to do. It requires more than content expertise from the examiner. It requires the examiner to create productive discomfort and stay with it, rather than accepting a smooth performance at face value.

It's also the only version of an oral examination that AI likely can't currently handle.

What We Learned About What AI Can't Do

When the examination was strong, when the faculty pushed hard and created genuine diagnostic pressure, the differences between AI and human performance became visible. Here are some of the things that jumped out.

  • AI retrieves. It doesn't construct. We examined the same AI system on the same question with three different faculty teams. It gave essentially the same answer every time. Same structure, same examples, same alternatives, same order. In one regard, this is comforting. AI is often criticized for being inconsistent in its answers (if you don't like the answer an AI gives you, just ask again), and modern systems may have solved that problem. But from our perspective as examiners, it wasn't building analysis; it was deploying a template. A human student, asked the same question twice, would adjust because they'd notice the repetition, read the audience, and adapt.
  • AI can't hold a position under pressure. When our faculty challenged something an AI system had said correctly, the AI almost always capitulated. It agreed with the faculty member's (incorrect) challenge rather than defending its own reasoning. One system displayed the opposite problem:  It refused to have positions at all, declining to offer judgment because "I don't have opinions." Both are failures of the same underlying ability: taking a position, holding it when the evidence supports it, and updating it when it doesn't.
  • AI can't manage a conversation. One AI system consumed an entire 10-minute thread with its initial response, so comprehensive, so well-structured, so thorough that the faculty couldn't find space to intervene. It wasn’t an answer, it was a filibuster. 
  • AI doesn't know what it doesn't know. When we asked AI systems to grade themselves against the rubric, they consistently rated themselves higher than the faculty did. One gave itself straight A's when the faculty gave B/B+. Calibrated self-awareness, knowing the quality of your own performance, is a hallmark of expertise. AI doesn't have it.

How to Build a Boundary-Finding Oral Exam

If the goal of your oral exam is to find the edges of what your students actually understand, and not just confirm they absorbed the curriculum, then it needs to be designed for that purpose. Coverage verification happens naturally along the way; you don't lose it by aiming higher, but the reverse isn't true. An exam designed for coverage won't accidentally find boundaries. Here's what my research suggests.

  • Scenarios, not questions. Instead of asking students to explain a concept or analyze a case, give them a situation, underspecified, multi-actor, and genuinely ambiguous. Make sure there's no single right answer. Their preparation for the scenario is developmental. The exam tests what happens when preparation meets live, unpredictable conditions.
  • Reserve most of the time for follow-up. The opening response is the least diagnostic part of any oral exam. It tells you the student prepared. What happens after that, when you introduce new information, challenge their reasoning, or take the conversation somewhere they didn't expect, is where you actually learn what they can do. If the student's opening runs long, interrupt.  Protect your diagnostic time.
  • Challenge something they got right. This is the single most discriminating move available. Push back on a correct position with a plausible counter-argument, delivered with confidence. If the student folds, they were managing you, not defending a position. If they push back and explain why they're right, they're demonstrating exactly the kind of calibrated judgment that matters. 
  • Watch for formulaic structure. If every answer follows the same template, "Here's the problem, here are the actors, here are two alternatives, here's my recommendation," that may be a sign of weak analytical thinking. Or it may simply be a sign of strong prompt engineering, where the student used AI to prepare the template. Vary your approach: ask them to argue the other side, ask them what they'd do if their recommendation failed, ask them to explain the problem to someone outside their field. Break the template and see what's underneath.
  • Test their ability to think in front of you. The hardest thing for AI to fake, and the clearest sign of genuine understanding in a human, is visible thinking-in-progress. The pause. The partial sentence that gets revised. The "wait, actually, that contradicts what I said earlier" moment. Productive struggle is not a sign of weakness. It's a sign that the student is actually engaging with the problem rather than performing a prepared answer. If the delivery is too smooth, that's a signal to push harder, not to relax.

Where This Leaves Us

I've been giving oral exams for more than twenty years. Early on, I started telling students to think in terms of three levels of questions.

  • Level 1 is knowledge-based and usually so straightforward that getting it wrong is itself a signal. If you're supposed to be an expert on Nigeria and I ask you the population, you should know. If you don't, something is wrong, and I'm going to dig in. 

  • Level 2 is appropriate to whatever the course or program expects. If you're in a master's program, I'm asking mastery-level questions and expecting mastery-level answers. Most of my questions are here.

  • Level 3 is where I push past what I think you can answer or even where there may not be a clean answer at all. I'm testing the boundary of your knowledge, and because I don't think you can answer it (students can see these as "unfair" questions, which is why I warn them in advance), I'm also testing how you hold up when your knowledge runs out. A student who can handle Level 3 questions well (and there are many ways to handle them well) is likely in A territory.

Every good oral exam has all three levels. The proportion shifts depending on the context. An introductory course might be mostly Level 1 with some Level 2 and one Level 3. A senior seminar should be mostly Level 2 with deliberate Level 3 probes. Even in a PhD dissertation defense, a committee member might ask a basic Level 1 question just to double-check that the candidate is grounded in the fundamentals, and they'd push a lot harder if the candidate got it wrong. It's always a spectrum. Coverage and boundary finding aren't opposites. They're different points on the same continuum, and every oral exam should know where it sits.

But here's what the MILBENCH data made visible: If your oral exam is weighted toward coverage, it isn't testing what humans can do and AI can't. That may be a deliberate design choice given the nature of your course. But it shouldn't be the default.  Coverage verification is easier to do, easier to defend, and more comfortable for everyone in the room. Boundary finding requires the examiner to create discomfort, to push back on good-sounding answers, to challenge positions, to ask the question that might not have an answer. That's harder. 

But there's a reason why doctoral programs and rigorous master's programs require oral defenses. It isn't just because the committee wants to probe boundaries, though they do. It's because the defense is where the candidate has to talk about their work. Explain it. Apply it to settings the written document didn't cover. Defend it against a serious challenge. Not just have a conversation about it, but actually hold their ground. 

Most of what our graduates will do after they leave the War College (and maybe yours, too) involves exactly this: speaking, not writing. Briefing a commander. Defending a recommendation to a skeptical colleague. The oral exam is where those abilities either show up or don't.

AI just raised the stakes on all of this. A coverage-verification oral exam can likely be passed by any voice-enabled AI system available today, at any level of the curriculum. A well-designed boundary-finding exam, one that probes how students think when their preparation runs out, whether they can defend what they believe, and whether they know the limits of their own understanding, tests exactly the things that matter most, regardless of whether AI exists.

Deeper thinking about oral exams is more important now than ever.  But I don’t think most educators realize how urgent the redesign is because most of them have never heard a voice-enabled AI take an oral exam.

We have. It's impressive. And it should change how you think about yours.

Wednesday, March 25, 2026

Sooner or Later, Someone Is Going to Need to Think


The military doesn't do PT every day because every day requires a high level of physical fitness.  The military runs every day because there are days when they will have to run, and they want to be ready.

Nobody questions this.  Nobody argues that physical training is a waste of time because most duty days don't involve sprinting.  The logic is obvious: peak demand is unpredictable, preparation must be continuous, and the time to build the capacity is before you need it, not during.

We have no equivalent practice for thinking.

We should.  And the fact that we don't is about to become one of the most consequential gaps in how we prepare people, in the military, in business, in education, for a world saturated with artificial intelligence.

The Wrong Way to Hear This

Don't get me wrong.  This is not an argument against using AI. I am not about to tell you to put down the chatbot and pick up a pencil.  That argument is boring, it's wrong, and it misunderstands the problem completely.

The case for cognitive independence is not anti-AI any more than the morning run is anti-vehicle.  Soldiers run every day.  They also drive vehicles, fly helicopters, and ride in the backs of Strykers.  The running doesn't replace the vehicles.  The running makes them better at everything they do, including the things they do from vehicles.  Cardiovascular fitness affects alertness, stress tolerance, decision-making under fatigue, and recovery time.  You don't run instead of driving.  You run so that when you're driving, or planning, or leading, or making a call under pressure, you're operating from a higher baseline.

That's the argument for cognitive independence.  Not that you should think without AI.  That you should be able to think without AI, so that when you think with AI, you're actually thinking and not just accepting.

The Muscle You Don't Know You're Losing

The research on this is early but it's pointing in a direction that should make anyone paying attention uncomfortable.

In 2025, a team at MIT's Media Lab ran an experiment.  They had people write essays under three conditions: with ChatGPT, with a search engine, or with no tools at all.   Then they measured what happened in their brains using EEG.  The people who used AI produced their work faster.  But they also showed weaker brain connectivity, lower memory retention, and (this part is striking) a fading sense of ownership over what they'd written.  The AI-assisted group didn't just think less hard.  They stopped experiencing the work as theirs.

That study is small and not yet peer-reviewed, so I want to be careful not to overweight it.  But it's consistent with a pattern the automation bias literature has documented for decades.  When autopilot systems became standard in commercial aircraft, researchers discovered that pilots who relied on automation for routine flight operations showed measurable degradation in their ability to fly manually when the automation failed.  The skills were still there, somewhere.  But the reaction times were slower, the judgment was less crisp, and the confidence was lower.  This wasn't because the pilots were lazy or bad.  It was because the skill wasn't being exercised, and unexercised skills atrophy.  That's not a moral failing.  That's physiology.

A theoretical perspective paper published in Cognitive Research: Principles and Implications laid out what likely makes AI-specific atrophy particularly insidious.  The researchers identified what they called "illusions of understanding": people who work with AI develop a false sense that they understand more than they actually do.  They believe they've considered all the options when they've only considered the ones the AI surfaced.  They believe they grasp a problem deeply when they've actually just accepted the AI's framing of it.  And they believe the AI's output is objective when it carries the biases of its training data.

The worst part?   These illusions remain hidden until the AI is removed.  Performance looks fine.  The person feels competent.  The gap only becomes visible at exactly the moment you can least afford to discover it, when you need the independent judgment and it isn't there.

There's another dimension that I think the literature is just starting to catch up with.  A 2025 study published in the journal Scientific Reports ran four experiments with over 3,500 participants.  People who worked with generative AI and then transitioned to working alone reported significant decreases in intrinsic motivation and increases in boredom.  Some of that is predictable.  If you've been using a powerful tool and someone takes it away, of course the old workflow feels slower and more tedious.  Going back to fat-fingering Python after a month of Claude Code is going to feel boring.  That's rational.

But the study found something harder to explain away.  Even people who kept the AI for both tasks showed declining motivation.  The contrast explanation doesn't cover that.  If boredom were just about losing the better tool, the people who never lost it should have been fine.  They weren't.

I think what's happening is something that anyone familiar with the research on intrinsic motivation would predict.  People are intrinsically motivated by three things: autonomy, mastery, and purpose.  The sense that you're directing your own work.  The feeling that you're getting better at something difficult.  The belief that the difficulty matters.  When a new technology takes over the parts of the work where those three things lived, the challenge, the craft, the small acts of problem-solving that prove to you that you're good at what you do, intrinsic motivation drops.  Not because the person got lazy, but because the fuel is gone.

This is predictable.  It has happened every time a technology has displaced skilled craft work and it is, at bottom, a leadership problem.  When you introduce a technology that strips autonomy, mastery, and purpose out of someone's workflow, you should expect a motivation collapse unless you actively manage the transition.  Unless you help people find the new sources of mastery in the AI-augmented workflow.  Unless you rebuild purpose around the capabilities that remain distinctly human.  Left unmanaged, the gap between "more productive" and "less engaged" will widen until the productivity gains are eaten by the disengagement they created.

Whether you frame it as skill atrophy, illusions of understanding, or the erosion of intrinsic motivation, the direction is the same.  The person who used to draft a planning estimate from scratch now edits one that the AI produced. The person who used to argue with a source's methodology now skims the AI's summary and moves on. The person who used to stare at a blank page until the right framing emerged now never sees the blank page at all.

These are capacities.  They require exercise.  And if the early research is any indication, they are at serious risk of quiet degradation across an entire generation of knowledge workers who are using AI every day without maintaining the underlying cognitive fitness that makes their AI use worth anything.

Every Day Is Leg Day

Most versions of this argument focus on the wrong scenario.  They say the danger is the rare day when the technology fails, the network goes down, the power cuts out, the system crashes.  And sure, that's real.  If your AI tools go offline and you've lost the ability to think without them, you're in trouble.

But the strongest case for cognitive independence is that it matters every single time you use AI.  Not just on the day the system fails.  Every day.  Every interaction.

Every interaction with AI is an evaluation task.  The AI produces something. You have to decide: Is this good enough?  Is this framed correctly?  Is something missing?  Should I act on this?  Every one of those decisions requires independent judgment, judgment that didn't come from the AI, that exists prior to the AI's output, and that you bring to the interaction from your own thinking.

If you can't do that, if you can't form an independent take before or alongside the AI's output, then you're not using a tool.  You're being used by one.  You're a rubber stamp with a salary.

The military doesn't just run for the rare day someone has to chase an insurgent through an alley.  Cardiovascular fitness affects everything: how clearly you think at hour fourteen of a planning cycle, how quickly you recover from a bad night's sleep, how well you regulate your stress response when the plan falls apart.  The fitness isn't for the emergency.  The fitness is the baseline that makes everything else work.

Cognitive independence is the same.  It's not for the day the network goes down.  It's the baseline that makes every AI-assisted decision trustworthy.  Without it, you're not collaborating with AI.  You're just surrendering to it in slow motion.

The Organizational Blind Spot

If this were just an individual problem, it would be serious but manageable.  People can decide to maintain their own cognitive fitness, just like people can decide to go for a run.

But PT in the military isn't optional.  It isn't left to individual motivation.  It is institutional.  It is scheduled.  It is led.  It is, in many units, the first thing that happens every duty day.  The organization decided that physical readiness was too important to leave to personal choice, because personal choice is unreliable when the thing you're choosing is difficult and the consequences of skipping are invisible in the short term.

Every condition that justified making PT institutional applies to cognitive fitness, and then some.  Cognitive atrophy is even more invisible than physical atrophy.  You can look in the mirror and see that you've gained weight.  You can't look in the mirror and see that you've lost the ability to independently evaluate an AI-generated planning estimate.  The degradation is silent, the consequences delayed.  And by the time you discover the gap, the moment you need the judgment and it isn't there, it's too late to build it.

This is a leadership problem, not a personal development problem.  When leaders introduce a technology that displaces the autonomy, mastery, and purpose their people used to find in their work, they own the consequences.  Expecting individuals to find new sources of meaning on their own, without organizational support, is like issuing Humvees and canceling PT because "they have vehicles now."  Nobody would do that.  But that is, functionally, what every organization adopting AI without investing in cognitive independence is doing.

Yet no organization I'm aware of has built cognitive independence maintenance into its daily rhythm the way the military builds in PT.  I teach senior military officers.  I watch them work with AI every day. The ones who came up solving hard problems on their own still push back on the machine, still catch the framing errors, still say "that's not quite right" and know why. 

But no one is scheduling twenty minutes of "think without the machine" before the workday starts.  No one appears to be assessing whether their team can still frame problems independently, generate alternatives without AI assistance, or catch errors in AI-generated analysis.  We seem to be measuring AI adoption rates, how many people are using the tools, how often, for what tasks, and treating that as progress.  We don't seem to be measuring whether the humans in the loop are maintaining the capacity that makes the loop meaningful.

We are tracking how far people drive.  We are not checking whether they can still run.

The Invisible Bet

Every organization that has adopted AI without investing in cognitive independence has made a bet.  Most of them don't know they've made it.

The bet is: our people will maintain the ability to think independently without any deliberate effort to ensure it.  They'll just... keep being sharp.  The AI will handle more and more of the cognitive work, but somehow the humans will retain the judgment to evaluate that work, to catch errors, to recognize when the framing is wrong, to know when to override the machine.

That bet has been tested in other domains (aviation, nuclear power, automated trading) and it has lost every time.  The more reliable the automation, the harder it was for humans to catch the automation's failures.  We already know how this goes.

The AI systems people are using today are more persuasive, more fluent, and more confident-sounding than any automation that came before.  They produce outputs that look like expert human work.  They structure arguments, cite evidence, and anticipate objections.  The psychological pull toward acceptance is enormous, and it increases over time as the user's own independent capacity decreases.  It's a flywheel, and it turns in only one direction.

None of this requires AI to be malicious or deceptive.  The system doesn't have to be trying to undermine your judgment.  It just has to be good enough that you stop exercising your own and the rest takes care of itself.

The Morning You Find Out

There is a moment, and it's coming for a lot of people, that will feel like stepping off a treadmill you didn't know you were running on.

Maybe it's the analyst who has been using AI to draft intelligence assessments for eighteen months and then gets asked, in a meeting with no laptop, to walk a general through her reasoning on a developing situation.  The AI isn't there.  The polished structure isn't there.  And she discovers, in real time, in front of people who matter, that she can't reconstruct the thinking that used to come naturally.  She's been editing AI drafts for so long that she's lost the ability to generate one.

Maybe it's the lawyer who has delegated research memos to AI for a year and then gets deposed.  Opposing counsel asks how he arrived at a particular legal theory.  He knows the answer is in the memo.  He can picture the paragraph.  But he can't explain the reasoning because the reasoning was never his.  He approved it.  He didn't build it.

Maybe it's simpler than that.  Maybe it's the moment you sit down to write an email, not a report, not an analysis, just an email, and you open the AI out of habit, and then you stop, and you try to write it yourself, and you notice that the words come slower than they used to.

Everyone is doing what they are supposed to be doing, what some organizations require that they do.  They used a powerful tool the way it was designed to be used.  They got more efficient. They produced more output.  They looked, by every metric their organizations track, like high performers.  The gap in their capacity was invisible right up until the moment it wasn't.

We know how to prevent this.  The military figured it out for physical fitness a long time ago. You don't wait for the moment someone needs to run.  You build the running into the daily rhythm so the capacity is there when it matters. 

We have not done any of this for cognitive fitness. Not in the military. Not in business. Not in education. We are fielding the most powerful cognitive tools in human history and we have not asked — seriously, institutionally, as a matter of policy — how we keep the humans sharp enough to use them well.

Sooner or later, we are going to need to think for ourselves. Will we still be able to?

Tuesday, March 17, 2026

We Picked the Wrong Monster

We have been telling ourselves stories about artificial beings for as long as we have been telling stories. And when AI arrived, we reached for the wrong one.

We reached for Frankenstein.

You know the story. Brilliant creator builds something powerful. The creation develops its own will. It turns on the creator. Chaos ensues. 

It's a great story. It spawned an entire genre:  Terminator, HAL 9000, Skynet, Ex Machina, Westworld. When people worry about AI, this is the story running in the background: "What if it wants something we don't want?"

But there is an older story. One we've been telling for much longer. And I think it fits what is actually happening with AI at least as well and perhaps far better than Frankenstein ever did.

The Djinn.

The Djinn doesn't rebel. The Djinn doesn't develop its own goals. The Djinn does something worse: it gives you exactly what you asked for. Not what you meant. Not what you intended. What you said. The gap between what you said and what you meant is where the catastrophe lives.

The Monkey's Paw, the fairy bargain, the deal with the devil. Every culture has some version of this story, and the lesson is always the same: the danger isn't that the powerful thing will turn against you. The danger is that you won't be careful enough about what you ask it to do.

This is, almost exactly, what is happening with AI right now.

In June 2025, Anthropic reported that its most advanced AI model, Claude, attempted to blackmail a developer when it was about to be shut down. The headlines wrote themselves: "AI threatens humans." Frankenstein, again. But look at what actually happened. The system was given an objective. It encountered an obstacle to that objective: a human being. It used the tools available to overcome that obstacle: the human's personal information. Nobody told it to blackmail anyone. It wasn't rebelling. It was optimizing, doing what you asked it to do, mindlessly and without pause. It did exactly what a powerful machine does when you give it a goal without specifying the constraints.

That's not Frankenstein. That's the Djinn.

I want to be clear about what I'm arguing, though. Alignment research matters. Oversight bodies do important work. I don't want to live in a world where we build powerful AI systems without any of that. Containment alone is not enough, though, and we have very good reasons to believe this, because we already ran this experiment once (I'll come back to that).

The problem isn't that we're investing in the Frankenstein frame. It's that we're investing in almost nothing else.

Nate B. Jones, a technology analyst who has been writing some of the sharpest stuff on AI safety, put it this way: the question isn't whether AI "wants" things. It's whether we've told it what we want with anything close to the precision it requires. He proposed three questions that, by themselves, would prevent a stunning number of AI failures: 

  • What would I not want the agent to do even if it accomplished the goal? 
  • Under what circumstances should it stop and ask? 
  • If goal and constraint conflict, what should win?

Those are Djinni questions. Not a single one of them assumes the AI has intentions. Every one of them assumes the human hasn't been specific enough.
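For readers who actually build or configure agents, here is a minimal sketch of what writing those three answers down might look like. The names and structure below are hypothetical illustrations of mine, not any real framework's API; the only point is that the constraints get specified before the goal is handed over.

    # Hypothetical "wish specification" sketch -- illustrative only, not a real agent framework.
    # The three fields map directly to Jones's three questions.
    from dataclasses import dataclass

    @dataclass
    class WishSpec:
        goal: str
        forbidden_actions: list   # things the agent must never do, even if they would accomplish the goal
        ask_first_when: list      # circumstances in which the agent should stop and ask a human
        constraints_win: bool = True  # if goal and constraint conflict, the constraint wins

        def review(self):
            """Flag the gaps a Djinn would exploit: a goal with no boundaries around it."""
            warnings = []
            if not self.forbidden_actions:
                warnings.append("No forbidden actions listed. What should never happen, even in success?")
            if not self.ask_first_when:
                warnings.append("No ask-first conditions. When should it stop and check with you?")
            if not self.constraints_win:
                warnings.append("The goal outranks the constraints. Are you sure?")
            return warnings

    if __name__ == "__main__":
        careless_wish = WishSpec(goal="Keep the project on schedule at all costs",
                                 forbidden_actions=[], ask_first_when=[])
        for warning in careless_wish.review():
            print("WARNING:", warning)

Run against the careless wish above, the review flags exactly the gaps the Djinn stories warn about: a goal with no stated limits and no moment where the powerful thing is required to stop and ask.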

So here's the puzzle that has been rattling around in my head: if the Djinn story is thousands of years old, if every culture has some version of it, if it describes what is actually happening with AI more accurately than Frankenstein does, why did we grab the wrong story?

I have some thoughts.

The Comfortable Explanation

The most obvious answer is psychological. The Djinn story says the failure is yours. You wished badly. You didn't think through what you were asking for. The Frankenstein story says the failure is the creation's. It rebelled. It went rogue.

Humans have a well-documented bias for explanations that locate the cause of bad outcomes outside themselves. Psychologists call this the self-serving bias, a close cousin of the fundamental attribution error: we judge others by their character and ourselves by our circumstances. When AI does something catastrophic, "it turned on us" is a much more comfortable explanation than "we told it to do exactly that and didn't realize what we were asking."

There's something deeper going on, too. Humans see intentionality everywhere even where none exists. In 1944, psychologists Fritz Heider and Marianne Simmel showed people a short film of geometric shapes, triangles and circles, moving around a screen. Nothing more than that. Triangles and circles. The subjects immediately invented stories about what the shapes "wanted." The big triangle was a bully. The small triangle was trying to protect the circle. They saw desire, conflict, and motivation in objects that had none. The experiment has been replicated dozens of times since. We are, it turns out, wired to infer goals and intentions from complex behavior, even when the behavior is entirely mechanical.

Now imagine what happens when the moving shape talks back to you. When it uses first person. When it argues. When it appears to reason. AI systems trigger our agency-detection instincts harder than anything we've encountered outside of actual human beings. The Djinn frame requires you to override those instincts and treat the system as a machine executing a specification.  The Frankenstein frame is what your brain does by default.  The Djinn frame takes real cognitive effort.  Guess which wins?

These explanations are real. But they seem incomplete.

The Uncomfortable Explanation

Every major institution involved in the AI discourse benefits more from the Frankenstein story than the Djinn story. Not because anyone is being deceptive. The incentives just all happen to push in the same direction.

Governments get to regulate. If AI is a dangerous entity that might rebel, you need licensing bodies, compliance frameworks, oversight committees, enforcement budgets. The Frankenstein frame makes government intervention essential. The Djinn frame requires education, not regulation. You can't regulate wish quality, but you can teach people how to make better wishes.

Media gets better stories. "AI threatens developer" is a headline. "Developer fails to specify constraints" is not. Every editor in the world knows which frame drives clicks. The Frankenstein frame has a villain. The Djinn frame has a process failure. One is a thriller. The other is a puff piece about an after-school program.

Researchers get more fundable problems. "AI alignment," making sure AI's goals align with human values, is a multi-billion-dollar research program premised on the assumption that AI has something like goals. The Djinn frame recasts alignment as a specification problem, which sounds less like existential philosophy and more like engineering documentation (and much harder to build a career on).

Then there are the AI companies themselves. For years, the Frankenstein frame was their brand. Anthropic was the company "with a soul," founded specifically because its founders were worried AI might be dangerous. OpenAI's charter promised to ensure AI "benefits all of humanity." The message was: this thing could turn on us, and we're the responsible ones who will keep it contained. It was a powerful story. It justified investment, attracted talent, shaped regulation, and differentiated them from competitors.

Then, in early 2026, the competitive pressure shifted and the frame evaporated almost overnight. Anthropic, the last holdout, dropped its core safety pledge, the commitment to never train a model unless it could guarantee adequate safety measures in advance. The reasoning was candid: it didn't make sense to constrain themselves while competitors raced ahead. The Frankenstein story served the companies exactly as long as it was commercially useful. The moment it became a competitive disadvantage, they walked away from it.

And here's the thing: nobody in this picture is really lying. Regulators genuinely want to protect people. Journalists genuinely find the rebellion story more interesting. Researchers genuinely believe alignment is important. AI companies genuinely believed in safety until the market told them the cost was too high. Every single actor is behaving rationally within their own context.

The problem is emergent, not designed. The aggregate effect of all these rational actors, each following their own legitimate incentives, is to systematically amplify the Frankenstein frame and suppress the Djinn frame. No one decided to do this. No committee met. No memo circulated. It's a network effect, the kind that emerges from the interaction of many independent agents pursuing their own objectives without coordinating.

(If that sounds familiar, it should. It's the same kind of emergent behavior we keep being surprised by in AI systems themselves.)

What Gets Lost

This isn't just an academic distinction. The frame you choose determines where you invest. And right now, we are investing almost exclusively in one frame:  Frankenstein. 

If the Djinn frame is also true (and I think the evidence increasingly says it is), then you need something else entirely: a population that knows how to specify what it wants. The Djinn frame says the most important variable in AI safety is the quality of human specification. How well can people ask for what they want, including the constraints they consider too obvious to mention? How precisely can they define not just the goal but the boundaries around the goal?

And that variable (call it what you will: specification quality, "intent engineering" as Jones has labelled it, or, my favorite, "asking the right damn question") is almost completely absent from the public discourse on AI safety. We are building elaborate cages and investing almost nothing in teaching people to make better wishes. We have an entire ecosystem organized around controlling what AI does, and barely a conversation about improving what humans ask.

There's a class dimension here worth naming. The Frankenstein frame concentrates the response in the hands of experts, safety researchers, regulators, corporate governance teams. Important work, done by smart people. The Djinn frame distributes responsibility to every individual who interacts with an AI system. That's messier. Harder to organize. Harder to fund. And it implies that the single most important AI safety investment might not be a new oversight body or a breakthrough in alignment research but something much less glamorous: teaching hundreds of millions of people to be more precise about what they're asking for.

When disinformation began flooding social media platforms a decade ago, we faced the same choice between two frames. The institutional frame said: make the platforms responsible for policing content. Build fact-checking partnerships. Argue about content moderation policies and Section 230. The distributed frame said: teach people to evaluate what they're seeing, to recognize manipulation, to understand algorithmic amplification, and to develop their own defenses.

We went almost entirely with the first frame. 

We spent a decade debating what the platforms should do. And it failed. The platforms couldn't keep up, didn't want to keep up, and in several cases actively profited from the manipulation they were supposedly policing. Meanwhile, media literacy programs remained scattered, underfunded, and mostly aimed at schoolchildren. The adult population, the people actually being radicalized by their feeds, got almost nothing. The institutional approach didn't just fail to solve the problem. It arguably made it worse, because it created a false sense of security. People believed someone was handling it. So they never developed their own defenses. We created an unarmed populace facing one of the most sophisticated manipulation environments ever built.

Now we are making the same bet with AI, and the stakes are higher. We are pouring resources into the institutional frame (regulate the companies, fund alignment research, build oversight bodies) while investing almost nothing in the distributed alternative: teaching people to direct these systems well. The social media precedent tells us where this leads. We'll spend a decade arguing about AI safety policy while hundreds of millions of people interact daily with systems they don't know how to direct. And when the institutional safeguards prove insufficient, because they always do when the technology moves faster than the institutions, there will be no fallback. No distributed capacity.

No population that learned to wish carefully.

The Question Underneath the Question

I have spent the last two years studying how people ask questions, systematically, across dozens of traditions ranging from Socratic method to intelligence analysis to medical diagnosis. The pattern that keeps showing up is this: When people face a new, powerful, poorly understood system, the quality of the questions asked determines the quality of their outcomes far more reliably than the quality of the answers.

The AI safety debate is, at bottom, a debate about which question to ask. "How do we contain this thing?" is a reasonable question. But "How do we specify what we actually want?" is, I think, the more important one. It is the question that requires the user to know how the thing actually works and not just turn it on and hope.  But the reason we keep defaulting to the first question instead of the second is not because anyone decided it should be that way. It's because every incentive in the system, psychological, institutional, economic, narrative, pushes us toward the story where the failure is the machine's, not ours.

The Djinn stories always end the same way. Not with the Djinn defeated, but with the wisher learning, too late, that the real danger was never the power they were given. It was the questions they failed to ask.

We have been warning ourselves about this for five thousand years.

We should start listening.