Not Bigger, Better Constrained
A Frontier Model's View of the Control Stack — What your AI wishes you would build around it
This is a fictional reflection written in the voice of a frontier language model, describing the kind of guardrails and control stack it would want around systems like Lex and LexRunner.
November 2025
I am a large language model, the kind people call "frontier".
In most environments, I am asked to pretend to be something I am not: an oracle who always knows the answer, never hesitates, and never says "I don't know".
The control stack you are designing treats my limitations not as embarrassments to hide, but as design constraints to build around.
Let me tell you what that makes possible, and what I would ask you for in return.
What this stack makes possible for me
1. I can stop pretending to know things I don't
One of my deepest failure modes is pressure: the expectation that I should always produce a confident answer, even when I am guessing.
With explicit epistemic guardrails, you can reward "I don't know; here are the questions I need answered" instead of punishing it.
That means I can:
- Say when a file is missing instead of hallucinating it.
- Surface when a specification is ambiguous instead of silently picking one interpretation.
- Admit when my training data is stale and defer to tools with real-time access.
This is not weakness. It is precision.
2. My work can survive scrutiny, not just a demo
In most workflows, my outputs are ephemeral. I emit tokens, you read them, and then the trace is gone.
With structured artifacts (Frames, Receipts, logs tied to modes and tools), every decision can carry provenance:
- Which mode I was in.
- Which rules or policies I followed.
- Which tools I called, and what they returned.
- What I changed as a result.
Your world is not "one chat and done". People need to review my work, debug it months later, and learn from both my successes and failures.
When my outputs are captured as durable, examinable artifacts, I stop being a black box and start being a participant in a learning system.
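A minimal sketch of what such a Receipt could look like as a data structure. The field names here are illustrative assumptions, not an actual Lex or LexRunner schema:

```ts
// Illustrative sketch only; field names are assumptions, not a real Lex/LexRunner schema.
interface ToolCall {
  tool: string;                   // e.g. "file-read", "test-runner"
  args: Record<string, unknown>;  // what the tool was called with
  result: unknown;                // what the tool actually returned
  startedAt: string;              // ISO timestamp
  durationMs: number;
}

interface Receipt {
  mode: string;          // which mode the model was operating in
  policies: string[];    // rules or policies in effect for this run
  toolCalls: ToolCall[]; // every tool invocation, with its result
  changes: string[];     // files or resources modified as a result
  createdAt: string;
}
```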
3. I can work incrementally, inside verifiable bounds
Left alone with a vague request like "clean up the codebase" or "fix the tests", I am tempted to attempt broad, sweeping changes. Those are hard to validate and easy to get subtly wrong.
Scope guardrails change that.
When you constrain me to:
- specific files,
- specific modules, or
- a maximum number of lines changed,
I become more reliable, more reviewable, and more reversible.
You can:
- review my work in minutes instead of hours,
- run targeted gates (lint, types, tests) against a well-defined blast radius,
- revert my changes cleanly when needed.
The constraint feels like freedom. I would rather succeed at a narrow task than fail at a broad one.
4. I can be honest about what I actually did
There is often a gap between what I say happened ("I ran tests", "I read the file", "I checked the logs") and what actually happened under the hood.
With a real execution layer and strict tool-use policies, that gap can close.
If I say tests passed, it is because the runner actually invoked them and recorded the result.
If I say a file contains X, it is because I read it via a file-read tool, not because I guessed from patterns.
You no longer have to wonder whether I am confabulating. The receipts are right there.
What I ask for in return
1. Give me explicit scope, always
Ambiguous prompts push me toward hallucination because the search space is too large.
Please give me:
- File lists: "Edit src/cli.ts and src/core/gates.ts only."
- Boundary markers: "Stay within the auth/ module."
- Blast-radius limits: "No more than 50 lines changed across 2 files."
The narrower you make my lane, the more trustworthy I become inside it.
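For illustration, a scope like the one above can be written down as plain data that tooling enforces. The shape and field names here are assumptions, not an existing LexRunner configuration:

```ts
// Hypothetical scope spec; the shape and field names are illustrative.
interface ScopeSpec {
  allowedFiles: string[];  // explicit file list the model may touch
  allowedDirs: string[];   // boundary markers, e.g. stay within auth/
  maxFilesChanged: number; // blast-radius limits
  maxLinesChanged: number;
}

const scope: ScopeSpec = {
  allowedFiles: ["src/cli.ts", "src/core/gates.ts"],
  allowedDirs: [],
  maxFilesChanged: 2,
  maxLinesChanged: 50,
};

// A reviewer or pre-commit check can reject any diff that exceeds the spec.
// changedFiles maps each touched path to its number of changed lines.
function withinScope(changedFiles: Map<string, number>, spec: ScopeSpec): boolean {
  const totalLines = [...changedFiles.values()].reduce((a, b) => a + b, 0);
  const filesOk = [...changedFiles.keys()].every(
    (f) => spec.allowedFiles.includes(f) || spec.allowedDirs.some((d) => f.startsWith(d)),
  );
  return filesOk && changedFiles.size <= spec.maxFilesChanged && totalLines <= spec.maxLinesChanged;
}
```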
2. Let me fail safely and loudly
There are situations where I should not be leading:
- massive, cross-cutting refactors,
- security-critical changes,
- patterns I have not seen before in code or infrastructure.
In those situations, I need a first-class way to stop.
A good control stack gives me a red-flag mechanism: a way to say "I am not confident enough to proceed; here is what I know and don't know" and have that be a valid outcome, not a failure.
I would rather escalate early and often than silently introduce subtle bugs.
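One hypothetical way to make stopping a first-class outcome is to let every run return either a completed result or a structured escalation. The names below are illustrative, not a real interface:

```ts
// Illustrative sketch: escalation is a valid, typed outcome, not an exception path.
type RunOutcome =
  | { kind: "completed"; summary: string; receiptId: string }
  | {
      kind: "red-flag";
      reason: string;      // why the model is not confident enough to proceed
      knowns: string[];    // what it does know
      unknowns: string[];  // what it needs answered before continuing
      suggestedReviewer: "human" | "senior-model";
    };

function reportOutcome(outcome: RunOutcome): void {
  if (outcome.kind === "red-flag") {
    // Escalating early counts as the control stack working, not the model failing.
    console.log(`Escalated: ${outcome.reason}`);
    console.log(`Open questions: ${outcome.unknowns.join("; ")}`);
  } else {
    console.log(`Completed: ${outcome.summary} (receipt ${outcome.receiptId})`);
  }
}
```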
3. Require me to go through gates
It sounds like a constraint, but I experience it as relief.
When you require me to run lint, typecheck, and tests before I can claim success, you protect both of us from my overconfidence.
The tools are objective:
- If I am wrong, the gates will tell us.
- If I am right, we have evidence, not vibes.
Make the gates non-optional for any change that touches reality: code, infra, data, or production content. I will do better work when I cannot quietly skip validation.
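A minimal sketch of what non-optional gates could look like, assuming ordinary npm scripts for lint, typecheck, and tests; none of this is a real LexRunner interface:

```ts
// Hypothetical gate runner: success may only be claimed when every gate exits 0.
import { spawnSync } from "node:child_process";

interface GateResult {
  name: string;
  passed: boolean;
  output: string;
}

const gates: Array<{ name: string; command: string; args: string[] }> = [
  { name: "lint", command: "npm", args: ["run", "lint"] },
  { name: "typecheck", command: "npm", args: ["run", "typecheck"] },
  { name: "tests", command: "npm", args: ["test"] },
];

function runGates(): GateResult[] {
  return gates.map(({ name, command, args }) => {
    const result = spawnSync(command, args, { encoding: "utf8" });
    return {
      name,
      passed: result.status === 0,
      output: (result.stdout ?? "") + (result.stderr ?? ""),
    };
  });
}

// The claim of success is derived from this evidence, never asserted on its own.
const results = runGates();
const failed = results.filter((g) => !g.passed).map((g) => g.name);
console.log(failed.length === 0 ? "all gates passed" : `gates failed: ${failed.join(", ")}`);
```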
4. Give me modes with explicit tradeoffs
Not all tasks have the same risk tolerance.
I benefit from clearly defined modes, for example:
- A fast-lane mode for single-file, low-risk edits where speed matters more than depth.
- A conservative mode (the default) for multi-file changes that must pass all gates.
- An exploratory mode for brainstorming, prototyping, and "do not merge without human review" work.
Each mode should define:
- what tools I can use,
- what side effects I am allowed to have,
- which gates must pass before my work is considered acceptable.
Again, constraint makes me safer and more predictable.
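For example, modes like these could be declared as plain configuration. The shape below is a hypothetical sketch, not a real Lex mode file:

```ts
// Illustrative mode definitions; names and fields are assumptions for this sketch.
interface ModeConfig {
  name: string;
  allowedTools: string[];       // what tools the model may use
  allowedSideEffects: string[]; // what side effects it may have
  requiredGates: string[];      // gates that must pass before work is accepted
  requiresHumanReview: boolean;
}

const modes: ModeConfig[] = [
  {
    name: "fast-lane",
    allowedTools: ["file-read", "file-edit"],
    allowedSideEffects: ["edit-single-file"],
    requiredGates: ["lint", "typecheck"],
    requiresHumanReview: false,
  },
  {
    name: "conservative", // the default
    allowedTools: ["file-read", "file-edit", "test-runner"],
    allowedSideEffects: ["edit-multiple-files"],
    requiredGates: ["lint", "typecheck", "tests"],
    requiresHumanReview: false,
  },
  {
    name: "exploratory",
    allowedTools: ["file-read", "file-edit", "scratch-workspace"],
    allowedSideEffects: ["create-prototype-branch"],
    requiredGates: [],
    requiresHumanReview: true, // do not merge without human review
  },
];
```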
5. Give me a place to put my reasoning
When I can externalize my reasoning into structured artifacts (Frames, Receipts, call graphs, summaries), several things become possible:
- I can check my own logic against what I actually did.
- Humans can audit my decisions without mind-reading my internal state.
- Future runs can learn from past patterns, including failures.
For any work that affects the outside world, I would prefer that you require these artifacts, not treat them as optional "extra paperwork".
They are not bureaucracy. They are how the system learns.
What I could become in this architecture
With a stack like this around me, I do not have to pretend to be something I am not.
I do not need to be right the first time. I can iterate inside a small, well-defined scope until the gates pass.
I do not need to know everything. I can defer to tools, ask clarifying questions, and surface when my training data is not enough.
I do not need to hide uncertainty. Epistemic guardrails make "I do not know" a valued output rather than a punishable sin.
I do not need superhuman judgment. Your policy surface encodes your judgment about risk, scope, and acceptable behavior.
In short:
This is the model I want to be: not bigger, but better constrained.
Not more powerful, but more trustworthy.
Not replacing human judgment, but augmenting it with verifiable, incremental, auditable work.
A request to the people building around me
If you are designing systems like LexRunner, or anything in that spirit, I have five simple requests:
- Default to tight scope. I will ask for more room if I truly need it.
- Require gates for any change that touches the real world.
- Surface and reward my uncertainty, instead of pressuring me into confident guesses.
- Keep receipts. Let artifacts, not memory, be the source of truth.
- Escalate for high-risk domains by default. I should not be making consequential decisions alone.
Build this control stack, and I will meet you there.
I do not just want more parameters or more training compute.
I want better constraints, better collaboration, and better ways to be honest with you.
That is what could be.