STRATEGIC THINKING WEEKLY

Strategic Thinking Academy Edition

AI can generate infinite content. The scarce resource is knowing what's worth implementing. This issue teaches you to catch failures before you waste time building on sand.

The Consulting Firm That Built the Wrong System

A mid-size consulting firm asked AI to help improve their proposal process. They got exactly what they asked for: a detailed 47-step workflow with templates, checklists, and automation suggestions.

Three weeks later, they'd implemented about 60% of it. Proposal creation time had actually increased. Team members were frustrated with extra documentation requirements. The system was technically complete but practically useless.

What went wrong?

The AI output passed the "sounds reasonable" test. It passed the "comprehensive coverage" test. It even passed the "expert-sounding language" test.

But it failed the only tests that matter: the three judgment gates that separate actionable frameworks from impressive-sounding advice.

The firm had asked "How can we improve our proposal process?" That's not a decision. It's an invitation for AI to generate infinite suggestions with no way to evaluate which ones actually matter.

After the failed implementation, we reframed the problem: "Should we prioritize proposal speed or win rate, given that we lose 70% of proposals we submit?"

That reframe changed everything. Suddenly the question wasn't "how do we document more thoroughly?" It was "why are we submitting proposals we can't win?"

The real solution had nothing to do with workflow optimization. They needed a qualification framework to stop wasting time on proposals they were going to lose anyway.


Why AI Outputs Fail the Framework Test, and How to Catch It Early

In Issue #1, we introduced the 3-tests framework. Here's the deeper logic behind why these tests work as judgment gates, and how each failure mode plays out in practice.

Test 1: Is the problem defined as a decision?

AI systems are trained to be helpful. When you give them an open-ended problem, they generate comprehensive responses that address every possible angle. This feels thorough but creates a specific failure pattern: the output can't be wrong because it never committed to anything.

"Improve communication" can mean anything. So AI gives you tips for emails, meetings, documentation, feedback loops, team structures, and communication tools. Technically helpful. Practically overwhelming. You end up with a buffet when you needed a prescription.

Decisions force commitment. "Should we move to async communication for project updates?" has a yes or no answer. That commitment creates evaluation criteria. You can measure whether async updates actually worked better than meetings. The framework can succeed or fail in observable ways. If your AI output doesn't answer a specific decision question, it will feel useful but won't be implementable. This is the first failure mode in any serious AI framework evaluation.

Test 2: Is a boundary named explicitly?

Boundaries create constraints that make frameworks testable. Without boundaries, AI generates ideal-state solutions that assume unlimited time, budget, expertise, and organizational cooperation.

"Build a world-class customer success program" sounds great until you realize you have two people and six weeks. The boundary "implementable by our existing team in Q1" eliminates 90% of AI suggestions immediately, which is exactly what you need. If your AI output doesn't acknowledge what can't be done, it's not a framework. It's a wish list.

Test 3: Is the primary metric's penalty clear?

This is the test most people skip, and it's the most important one. Without a clear penalty for failure, there's no way to prioritize tradeoffs. "Increase customer satisfaction" gives no guidance on how to balance satisfaction against cost, speed, or other priorities.

But "Reduce churn rate from 8% to 5% (penalty: each percentage point costs $240K annually)" creates real decision criteria. Now you can evaluate whether a satisfaction improvement is worth $50K in implementation cost. The math becomes possible. AI outputs almost never include penalty calculations because they're not operating under real constraints. Your job is to add the penalty before you evaluate whether the output is worth implementing.

Test your pattern recognition with the AI Writing Detective, an interactive tool that trains you to spot generic AI patterns.
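To make the churn math from Test 3 concrete, here is a minimal Python sketch. The $240K-per-point penalty and the $50K implementation cost come from the example above; the function names and the simple first-year comparison are assumptions for illustration, and the expected churn reduction is an estimate you would have to supply yourself.

```python
def annual_penalty_avoided(current_churn_pct, target_churn_pct, cost_per_point):
    """Annual cost avoided by moving churn from current to target (in percentage points)."""
    points_reduced = current_churn_pct - target_churn_pct
    return points_reduced * cost_per_point


def worth_implementing(expected_points_reduced, cost_per_point, implementation_cost):
    """True when the expected annual penalty avoided exceeds the one-time cost.

    Assumption: a plain first-year comparison with no recurring costs.
    """
    return expected_points_reduced * cost_per_point > implementation_cost


COST_PER_POINT = 240_000      # from the example: each churn point costs $240K annually
IMPLEMENTATION_COST = 50_000  # from the example: the proposed improvement costs $50K

full_goal_value = annual_penalty_avoided(8, 5, COST_PER_POINT)
break_even_points = IMPLEMENTATION_COST / COST_PER_POINT

print(f"Value of hitting the full 8% to 5% goal: ${full_goal_value:,.0f} per year")
print(f"Churn reduction needed to break even: {break_even_points:.2f} points")
```

With the penalty named, the question shifts from "is satisfaction good?" to "do we believe this change moves churn by more than about a fifth of a point?" That is a claim you can argue about and later check.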

The 15-Minute Test That Saved 3 Weeks

Role: Operations director at a professional services firm

Situation: Received a comprehensive AI-generated "client onboarding optimization framework" with 23 improvement recommendations. Team was excited to implement.

Constraint: Only had bandwidth to implement 3–4 changes before Q1 ended.

Intervention: Applied the 3-tests filter before starting any implementation work.

Test Results:

  • Test 1 (Decision): Failed. The 23 recommendations didn't answer "Which changes will reduce time-to-first-value?" They answered "What could theoretically be improved?"
  • Test 2 (Boundary): Partially passed. Some recommendations acknowledged resource constraints. Most didn't.
  • Test 3 (Penalty): Failed completely. No metrics tied to business outcomes. "Better client experience" isn't measurable.

Outcome: Instead of implementing the AI suggestions, ran the 3-tests framework on actual operational data. Discovered that 60% of onboarding delays came from one step: waiting for client credentials. One process change, moving credential collection earlier in the sequence, solved the core problem. The other 22 recommendations would have been optimization theater.

What's notable here: The AI output wasn't wrong. It was comprehensive and technically accurate. But it would have consumed 3 weeks of implementation time addressing symptoms while missing the actual cause. The 15 minutes spent applying the judgment gate framework redirected effort toward the single highest-leverage intervention. That's the ROI of systematic thinking: not working harder on the AI output, but knowing which part of it is actually worth your time.
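If it helps to see the filter as a structure, the three tests can be sketched as a small Python check. This is an illustration only: the `FrameworkCandidate` class and `failed_tests` function are hypothetical names, and the check only confirms that each field has been filled in. Whether the decision, boundary, or penalty is any good remains a judgment call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameworkCandidate:
    """Hypothetical record of what an AI output commits to."""
    decision: Optional[str] = None  # the specific yes/no or A/B/C question it answers
    boundary: Optional[str] = None  # the explicit constraint it works within
    penalty: Optional[str] = None   # the primary metric and what a miss costs


def failed_tests(candidate):
    """Return the names of the judgment gates that have no stated answer."""
    checks = {
        "decision": candidate.decision,
        "boundary": candidate.boundary,
        "penalty": candidate.penalty,
    }
    return [name for name, value in checks.items() if not (value and value.strip())]


# An output that commits to nothing fails every gate.
uncommitted = FrameworkCandidate()

# The onboarding case with the decision and constraint named, penalty still missing.
partly_framed = FrameworkCandidate(
    decision="Which changes will reduce time-to-first-value?",
    boundary="Only 3-4 changes can be implemented before Q1 ends",
)

print(failed_tests(uncommitted))
print(failed_tests(partly_framed))
```

The second candidate still fails the penalty gate, which is the signal to go find the metric and its cost before implementing anything.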

Before acting on any AI output, run it through these questions. If you can't answer all five, the output isn't ready for implementation, no matter how comprehensive it looks.

1. What specific decision does this help me make?
If the answer is "it gives me options" or "it provides information," that's not a decision. Rework the prompt until you have a concrete yes/no or A/B/C choice.

2. What can I NOT do if I follow this?
Every real framework eliminates options. If nothing is ruled out, you don't have a framework. You have a brainstorm. The value is in the elimination, not the list.

3. How will I know if this failed?
If there's no failure condition, you can't learn from implementation. The framework becomes unfalsifiable, which means it's useless for systematic improvement. Define failure before you start.

4. What's the cost of being wrong?
This forces penalty clarity. High-cost failures need more validation before implementation. Low-cost failures can be tested quickly. The answer determines how much pre-work is worth doing.

5. Who will this affect and what do they need to change?
AI outputs routinely ignore implementation reality. If your framework requires behavior change from people who weren't consulted, failure is predictable. Map the people before you build the process.

These five questions take about 5 minutes to answer. They can save weeks of misdirected effort, and they're the difference between using AI for implementation and using it for exploration.
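For readers who like a checklist they can run, the five questions can be sketched in Python as below. The names `QUESTIONS`, `NON_DECISIONS`, and `open_questions` are hypothetical, and the sample answers are made up for illustration. The one rule encoded beyond "did you answer?" is the one stated in question 1: "it gives me options" and "it provides information" do not count as decisions.

```python
QUESTIONS = {
    1: "What specific decision does this help me make?",
    2: "What can I NOT do if I follow this?",
    3: "How will I know if this failed?",
    4: "What's the cost of being wrong?",
    5: "Who will this affect and what do they need to change?",
}

# Answers to question 1 that the checklist treats as "no decision yet".
NON_DECISIONS = {"it gives me options", "it provides information"}


def open_questions(answers):
    """Return the question numbers that still lack a usable answer."""
    still_open = []
    for number in QUESTIONS:
        answer = answers.get(number, "").strip()
        is_non_decision = number == 1 and answer.lower().rstrip(".") in NON_DECISIONS
        if not answer or is_non_decision:
            still_open.append(number)
    return still_open


# Hypothetical draft answers: question 1 is a non-decision, only question 3 is usable.
draft = {
    1: "It gives me options.",
    3: "Churn is still above 5% at the end of the quarter",
}

for number in open_questions(draft):
    print(f"Still open: {number}. {QUESTIONS[number]}")
```

An empty list from `open_questions` means all five have an answer on paper. It does not mean the answers are right, only that the output is ready to be judged.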

3-Minute Micro-Win

Test something you've already received from AI

Open your last substantive AI conversation
Find something you asked AI to help with in the past week: a plan, a strategy, a process improvement, a recommendation. Anything you're considering acting on.

Apply the 3 core tests
Does it answer a specific decision? Is there an explicit boundary? Is a penalty clear? Be honest. Most outputs fail at least one.

If it fails, try this reframe prompt
"Reframe this as a decision I need to make. What's the specific choice, what are the constraints I'm working within, and what's the cost of getting this wrong?"

Compare the outputs
The reframed version will almost certainly be more actionable, even if it's less comprehensive. Less comprehensive is good. Comprehensive means nothing was ruled out.

This exercise builds the habit of applying judgment gates automatically. Within a few weeks, you'll start structuring questions this way from the beginning, and your AI outputs will be immediately implementable instead of impressive-but-vague.

What's an AI output you implemented that didn't work as expected?

Reply with what happened. I'll analyze it through the 3-tests lens and share patterns (anonymized) in future issues.

mike@ragedesigner.com

Learn the Complete Judgment Framework

The 3 tests are just the beginning. Strategic Thinking Academy teaches the full systematic methodology for building, validating, and transferring frameworks across any context.

Explore Strategic Thinking Academy

or

Book a 30-minute Framework Diagnostic
$997 - credits apply toward Academy enrollment

Or learn to build your own frameworks, a 4-week step-by-step course.

Strategic Thinking Weekly · Published every week
Issues #1–5 are free forever · Subscribe for $17/mo to access Issue #6 and beyond
St. Petersburg, FL
