Best Practices

Are You Doing Policy Theater?

The Air Canada ruling proved your chatbot IS your company. 88% of enterprises deploy AI, but only 14% have real governance. Here's how to tell if yours is theater or infrastructure.

MMNTM Research
#AI Governance  #Compliance  #Enterprise  #Risk Management

In February 2024, a Canadian tribunal ruled on a case in which an Air Canada chatbot had invented a bereavement fare policy that didn't exist. The bot had promised a grieving customer a refund he wasn't entitled to.

When the customer sued, Air Canada argued the chatbot was a "separate entity"—not the company, just a tool. The tribunal rejected this completely. The chatbot is the company. Every output is a legally binding representation.

Air Canada lost. And so did the legal defense that "it's just AI."


The Gap

This should terrify you if you're deploying agents without real governance.

88% of organizations now use AI in at least one business function. But only 14% have enterprise-level governance frameworks.

The Governance Gap

74 points

The difference between AI adoption (88%) and governance readiness (14%)

And it's getting worse. Cisco's latest AI Readiness Index found that readiness actually declined year-over-year—from 14% to 13% "fully ready." As AI gets more capable, organizations are getting less prepared.

The result: 75% experienced an AI-related security breach in the past year.

The gap isn't abstract. It's already causing damage.


Policy Theater vs. Real Governance

Here's what I've noticed about the governance conversation: it's focused on the wrong layer.

The typical enterprise response to "we need AI governance" is:

  • Form an ethics committee
  • Publish responsible AI principles
  • Create a governance framework document
  • Assign someone to "own" AI policy

None of this stops your agent from hallucinating to a customer at 3am.

This is policy theater—governance that looks good on paper but doesn't actually prevent failures. It satisfies the checkbox, not the risk.

Real governance is operational infrastructure. It's code, not documents. It prevents bad outputs from shipping, not just from being against policy.

| Feature | Policy Theater | Operational Infrastructure |
|---|---|---|
| Oversight | Ethics board meets quarterly | HITL checkpoints block outputs in real time |
| Auditability | "Maintain appropriate logs" | Every LLM call traced: inputs, outputs, reasoning, latency |
| Incident response | Committee reviews post-mortems | Circuit breakers kill runaway agents automatically |
| Compliance | Framework mapped to EU AI Act | Immutable audit trail satisfies Article 14 requirements |
| Accountability | RACI matrix exists | Confidence thresholds route decisions to humans |
| Failure mode | Discovered by customer complaint | Discovered by automated monitoring |

The Air Canada chatbot had no real-time oversight, no confidence thresholds, no circuit breaker. It hallucinated a policy and delivered it to a customer with complete confidence. That's what policy theater gets you.
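Tracing every LLM call is mostly plumbing. Here's a minimal Python sketch of the idea, assuming a hypothetical `call_llm` function that returns an output and a confidence score; every invocation leaves an immutable audit record with inputs, outputs, and latency:

```python
import json
import time
import uuid


def traced_call(call_llm, prompt, audit_log):
    """Wrap an LLM call so every invocation leaves an audit record.

    `call_llm` is a hypothetical function returning (output, confidence);
    `audit_log` is any append-only sink (a list here, a durable store in
    production).
    """
    trace_id = str(uuid.uuid4())
    start = time.monotonic()
    output, confidence = call_llm(prompt)
    record = {
        "trace_id": trace_id,
        "prompt": prompt,
        "output": output,
        "confidence": confidence,
        "latency_ms": round((time.monotonic() - start) * 1000, 2),
        "timestamp": time.time(),
    }
    # Serialize at write time so the record can't be mutated later.
    audit_log.append(json.dumps(record))
    return output, confidence


# Usage with a stubbed model:
log = []
fake_llm = lambda p: ("Bereavement fares are not refundable after travel.", 0.91)
answer, conf = traced_call(fake_llm, "What is the bereavement fare policy?", log)
```

The point is that the trail is produced as a side effect of the only code path that can reach the model, not reconstructed after the fact.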


The Diagnostic

You're doing policy theater if:

  • Your AI governance is a document, not a system
  • Your ethics board has never blocked a deployment
  • You couldn't produce an audit trail of last week's agent decisions in 24 hours
  • Low-confidence outputs ship to customers without human review
  • You'd learn about a hallucination from a customer complaint, not a dashboard
  • Your incident response plan has never been tested
  • "Human oversight" means someone could theoretically intervene, not that they actually do

You have operational infrastructure if:

  • Every agent output is logged with inputs, reasoning, and confidence scores
  • Outputs below confidence threshold route to humans before shipping
  • Circuit breakers automatically kill agents that loop or exceed cost limits
  • You can reconstruct exactly what happened when something goes wrong
  • Drift detection alerts you when model behavior changes
  • You've tested your incident response in a drill, not just a document
  • Human oversight is a code path, not a policy statement

Count your checks. If you're heavy on the first list, you're doing theater.
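Several items on the second list are surprisingly little code. A circuit breaker, for instance, is a step and cost counter wrapped around the agent loop. A minimal sketch, with illustrative limits:

```python
class CircuitBreaker:
    """Kill an agent run that loops too long or spends too much.

    The limits here are illustrative; real deployments tune them per
    workload.
    """

    def __init__(self, max_steps=20, max_cost_usd=1.00):
        self.max_steps = max_steps
        self.max_cost_usd = max_cost_usd
        self.steps = 0
        self.cost_usd = 0.0
        self.tripped = False

    def record(self, step_cost_usd):
        """Account for one agent step; return False when the run must stop."""
        self.steps += 1
        self.cost_usd += step_cost_usd
        if self.steps > self.max_steps or self.cost_usd > self.max_cost_usd:
            self.tripped = True
        return not self.tripped


# Usage: an agent loop that checks the breaker on every step.
breaker = CircuitBreaker(max_steps=5, max_cost_usd=0.10)
steps_run = 0
while breaker.record(step_cost_usd=0.03):  # each step costs $0.03
    steps_run += 1  # ...do one agent step...
```

The agent stops when the cost ceiling trips, regardless of whether it "thinks" it's making progress. No committee meets; the loop just ends.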


Why This Matters Now

The EU AI Act requires human oversight for high-risk AI systems—not as a suggestion, but as a legal mandate. Article 14 specifies that humans must be able to "stop or override" the AI and must be capable of detecting "automation bias."

That's not a policy requirement. That's an architecture requirement.

The enforcement timeline is already running:

  • August 2025: General-purpose AI governance requirements take effect
  • August 2026: High-risk system requirements (including human oversight) fully enforced
  • Penalties: Up to €35 million or 7% of global turnover

Organizations with policy theater will fail audits. Organizations with operational infrastructure will pass.


What Infrastructure Actually Looks Like

JPMorgan's Model Risk Governance framework treats AI agents like junior analysts: independent validation, human confirmation for high-stakes decisions, real-time monitoring. Not a policy. A system.

ServiceNow uses "human-at-the-helm" patterns—before an agent modifies data or sends a communication, the system pauses for human confirmation. It's a code path that runs on every request, not a governance document that sits in SharePoint.
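That pattern reduces to a small code path: before any side-effecting action runs, the request parks in a queue until a human approves it. A minimal sketch (the queue, action names, and approval callback are all hypothetical, not ServiceNow's API):

```python
from dataclasses import dataclass
from queue import Queue
from typing import Callable


@dataclass
class PendingAction:
    description: str
    execute: Callable  # the side effect; runs only after approval
    approved: bool = False


def propose(action: PendingAction, review_queue: Queue):
    """Agent proposes a side-effecting action; nothing runs yet."""
    review_queue.put(action)


def human_review(review_queue: Queue, approve: Callable) -> bool:
    """A human pulls the next action and approves or rejects it."""
    action = review_queue.get()
    if approve(action):
        action.approved = True
        action.execute()
        return True
    return False


# Usage: the agent wants to email a customer; a human must confirm first.
sent = []
action = PendingAction(
    description="Send refund confirmation email",
    execute=lambda: sent.append("email sent"),
)
queue = Queue()
propose(action, queue)
ok = human_review(queue, approve=lambda a: "refund" in a.description)
```

The design choice that matters: `execute` is unreachable except through `human_review`. Oversight isn't someone watching a dashboard; it's the only route to the side effect.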

The HITL Firewall pattern delivers this at scale: >85% confidence auto-approves, 70-85% gets fast-track review, <70% gets full escalation. Result: 85% cost reduction at 98% accuracy.
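Those thresholds translate directly into a routing function. A sketch using the numbers above (the tier names are illustrative):

```python
def route(confidence: float) -> str:
    """Route an agent output by confidence score, per the tiers above.

    > 0.85      -> auto-approve (ships without review)
    0.70 - 0.85 -> fast-track human review
    < 0.70      -> full escalation to a specialist
    """
    if confidence > 0.85:
        return "auto_approve"
    if confidence >= 0.70:
        return "fast_track_review"
    return "full_escalation"


# Usage:
assert route(0.92) == "auto_approve"
assert route(0.78) == "fast_track_review"
assert route(0.55) == "full_escalation"
```

Ten lines, but they're the difference between a hallucinated policy shipping at 3am and a human seeing it first.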

This isn't about slowing down. It's about building the controls that let you speed up safely.


The Bottom Line

The governance gap is real: 88% deploy, 14% govern, and readiness is declining.

But the harder truth is that most of that 14% is theater. Documents and committees that don't actually prevent the Air Canada scenario.

Real governance is operational infrastructure:

  • Audit trails that actually log every decision
  • Human oversight that actually reviews risky outputs
  • Circuit breakers that actually stop failures
  • Monitoring that actually catches drift

The question isn't whether you have a governance framework. It's whether your governance would have stopped the Air Canada chatbot from promising a refund that didn't exist.

If the answer is no, you're doing theater.