2026-05-15 · 5 min · quality · architecture · evaluation

Self-critique loop · how the Mythic tier sharpens before it streams

Mythic-depth readings go through a three-pass write → critique → revise loop · the primary model drafts, Haiku 4.5 reviews, the primary model rewrites. Roughly 1.7× the cost of a single-pass call · gated to the tiers whose plan headroom absorbs it.

When a user picks "Mythic" depth, we don't just give them a more expensive model. We give them a different pipeline.

The three passes

  1. Primary model drafts — Opus 4.7 with adaptive thinking + xhigh effort writes a full reading. We buffer this output server-side instead of streaming it to the client. Tool-injected markers (the cards / chart / hexagram visuals) still go through immediately so the user sees them within a few seconds.

  2. Haiku 4.5 reviews — the buffered draft is handed to Haiku with a focused critique prompt: identify 2–4 specific problems (translation-stiff Chinese, off-topic, over-hedged, generic, repetitive) and quote the offending fragments. Haiku writes notes, not rewrites.

  3. Primary model revises — the draft + Haiku's notes go back to Opus 4.7. It rewrites the reading per the notes and streams the revised version to the user.
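The three passes can be sketched as a single orchestration function. This is an illustrative sketch, not our production code: `callModel`, `stream`, the model identifiers, and the prompt wording are all assumptions standing in for the real pipeline.

```typescript
// Hypothetical sketch of the three-pass loop. The names callModel,
// "opus-4.7", and "haiku-4.5" are illustrative, not a real API.
type ModelCall = (model: string, prompt: string) => Promise<string>;

async function mythicReading(
  question: string,
  callModel: ModelCall,
  stream: (text: string) => void,
): Promise<string> {
  // Pass 1: primary model drafts. Buffered server-side, NOT streamed
  // (tool-injected visual markers would be forwarded separately).
  const draft = await callModel(
    "opus-4.7",
    `Write a full reading for: ${question}`,
  );

  // Pass 2: cheap reviewer writes notes, not rewrites.
  const notes = await callModel(
    "haiku-4.5",
    `Identify 2-4 specific problems in this draft and quote the ` +
      `offending fragments:\n${draft}`,
  );

  // Pass 3: primary model rewrites per the reviewer notes;
  // only this revised version streams to the user.
  const revised = await callModel(
    "opus-4.7",
    `Rewrite the reading below per the reviewer notes.\n` +
      `Draft:\n${draft}\nReviewer notes:\n${notes}`,
  );
  stream(revised);
  return revised;
}
```

The key structural point is that only the third call's output ever reaches the client; the first two round-trips are invisible except as latency.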

Why a different model for critique

Same-model self-critique has a known blind spot: the writer model tends to defend its own choices. A cheaper, faster, differently-trained model reviewing the same output catches things the writer is structurally prone to miss — verbose paragraphs that feel like padding, sentences that hedge into vagueness, clichéd closing remarks, and (the big one in Chinese) syntax that reads like a translation rather than the native register of Chinese metaphysics writing.

Haiku 4.5 is a good pick for this role: it costs ~$0.005 per critique, returns in under a second, and has stylistic biases different from its larger siblings'.
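One practical detail worth showing: because the reviewer quotes offending fragments, the revision pass can discard any note whose quote doesn't actually appear in the draft. The shape below is an assumption for illustration, not our real schema; `CritiqueNote` and `filterNotes` are hypothetical names.

```typescript
// Illustrative note shape for the critique pass (hypothetical schema).
// The issue labels mirror the critique prompt's problem categories.
interface CritiqueNote {
  issue: "translation-stiff" | "off-topic" | "over-hedged" | "generic" | "repetitive";
  quote: string;  // exact offending fragment from the draft
  advice: string; // what the revision pass should do instead
}

// Drop notes whose quote isn't actually in the draft (a hallucinated
// critique), then cap at the 4-note budget before the revision pass.
function filterNotes(draft: string, notes: CritiqueNote[]): CritiqueNote[] {
  const grounded = notes.filter((n) => draft.includes(n.quote));
  return grounded.slice(0, 4);
}
```

Grounding notes against the draft this way keeps the revision pass from "fixing" problems the reviewer invented.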

What it costs

Roughly 1.7× a single-pass Mythic call. Most of that is the second Opus pass; Haiku critique is rounding error. This is why we gate the loop to Mystic / Founder / Mythic tiers — their plan headroom absorbs the spend, and they're the cohort where premium quality is the product.

What it improves

We don't have published benchmarks yet — we're holding off on numbers until the eval framework ships and we've run a few hundred A/B comparisons. Anecdotally, during internal testing, the revised output is meaningfully more specific to the user's actual question and noticeably less hedged. Eval data will replace this paragraph when it lands.

What it does NOT do

The critique pass is not a safety filter — Anthropic's model-level safeguards and our own pre-stream gates (runSafetyCheck, the banned-words list, the crisis classifier) all run before the loop ever starts.

It's also not a fact-checker. Computational results (BaZi pillars, hexagram numbers, planetary positions) all come from our deterministic compute pipeline before the LLM is even called. The critique pass cares about voice and depth, not arithmetic.

Where this fits in the stack

  • Free / Wanderer / Pro / Premium users: single-pass.
  • Mystic / Founder / Mythic users at depth = "mythic": three-pass loop.
  • Mystic / Founder / Mythic users at depth = "deep": single-pass with adaptive thinking but no critique loop.

The pipeline switches at runtime per request. The Haiku critique runs only when the user actually opts into mythic depth via the depth picker.
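The routing rule above reduces to a small predicate. The tier and depth names come from this post; the function itself is an illustrative sketch, not our actual routing code.

```typescript
// Illustrative per-request routing predicate (hypothetical names).
type Tier = "free" | "wanderer" | "pro" | "premium" | "mystic" | "founder" | "mythic";
type Depth = "deep" | "mythic";

// The three-pass critique loop runs only for the top three tiers,
// and only when the user opts into mythic depth via the depth picker.
function useCritiqueLoop(tier: Tier, depth: Depth): boolean {
  const eligible = tier === "mystic" || tier === "founder" || tier === "mythic";
  return eligible && depth === "mythic";
}
```

Everything else — eligible tiers at "deep" depth, and all lower tiers — falls through to the single-pass path.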


Posted 2026-05-15 · Astrael Engineering. We're an AI company — this is what we do.

Research collaboration or feedback? research@astrael.ai.