We used to design software for one kind of user: humans.
Then we realized developers are users too, and we invented DX (Developer Experience). We learned that if you treat devs as first-class customers of your tools, the whole ecosystem gets better: fewer foot-guns, better APIs, better docs, better systems.
Now there is a third kind of user we keep pretending is "just an implementation detail":
- AI and non-human agents that read our repos
- Agents that call our APIs and CLIs
- Agents that orchestrate workflows
- Agents that make decisions inside constraints we define
We talk about them all day. But we almost never design for them.
That is what AX — Agent eXperience — is about.
AX is the discipline of designing systems where agents are first-class consumers of tools, APIs, workflows, and memory.
Humans still matter. DX still matters. But AX asks a very direct question:
Could an agent operate this system reliably, without a human babysitting every step?
If the honest answer is "no," then we do not really have an AI-native system. We have a human system with some AI sprinkled around the edges.
A Note on "Experience"
Let's be honest about terminology.
Agents do not have "experiences" the way humans do. A human developer who hits a bad API cares. They curse. They file an issue. They switch tools. An agent does not care. It just produces worse outputs and moves on.
So why call it "Agent eXperience"?
Because the framing is useful—not for the agent, but for you.
When you think "user experience," you think about friction, confusion, dead ends, missing affordances. You ask: "What does this person need to succeed?" That mindset—applied to agents—surfaces the same design questions.
The deeper truth is that AX is really about legibility—making systems legible to non-human operators. But "Agent Legibility" does not roll off the tongue.
So we say AX. Just remember what it actually means:
If you cannot explain your system to an agent, you probably do not understand it yourself.
Curious how a model would read this? See a short candid addendum: From the inside — an experimental model's candid take on AX
From UX and DX to AX
Key insight: AX augments UX and DX—it does not replace them. Each discipline adds a new user class to the design surface. Humans, developers, and agents coexist. Systems that become legible to agents become more understandable to humans, not less.
Founder's maxim: If we hide behind "it doesn't obviously raise the floor" as a reason to never move, that's not caution—that's cowardice. Progress that starts with a few is fine. Progress that never even tries to reach the many is not.
👤 UX asks:
"Is this intuitive, forgiving, and emotionally sane for a human?"
👩‍💻 DX asks:
"Is this predictable, well-documented, and composable for a developer?"
🤖 AX asks:
"Can an agent discover, predict, recover, contextualize, and comply—without hand-holding?"
The point of AX is not "be nice to the robots." It is much more selfish than that:
- Better AX → fewer hallucinated workflows
- Better AX → more reliable automation
- Better AX → less time debugging "AI weirdness" that was actually our fault
- Better AX → easier to swap models (Claude / GPT / Gemini / Copilot) because the system carries the intelligence, not the model-of-the-week
- Better AX → better UX and DX too, because the same legibility that helps agents helps everyone
What It's Like To Be An Agent In Your Stack
Imagine you are an agent dropped into a random repo.
You get:
- A README with three slightly conflicting onboarding paths
- CLIs that sometimes print JSON, sometimes walls of logs
- Errors like "Something went wrong, please try again"
- No obvious way to tell what already happened last week
- No clear constraints other than "Please do not break prod"
You cannot go ask a teammate for clarification. You can only infer.
If the CLI output is unstructured, you guess.
If the error is vague, you guess.
If the state is unclear, you guess.
When all we see is "the model made a weird choice," we blame the model. But often the truth is simpler:
AX was bad. The agent did the best it could with a hostile environment.
And here is the deeper truth underneath that:
The environment was hostile because it was never designed to be understood—only used.
Humans use systems. Agents interpret them. Interpretation requires legibility. Most systems are not legible—they are a pile of conventions, implicit knowledge, and "you just have to know" folklore.
An agent cannot know. It can only read what is there.
The Five Principles of AX
We use five working principles to shape AX in Lex (memory/policy) and LexRunner (orchestration). They generalize to any agent-facing system.
1 Deterministic First
Agents can be stochastic. The environment should not be.
If the model is non-deterministic and the system around it is also non-deterministic, no one can debug anything.
Good AX:
- Running the same gate on the same code always yields the same result
- The same plan input always produces the same ordered steps
- If randomness is required, the seed or strategy is explicit
Bad AX: Flaky tests. "Sometimes this step runs, sometimes it doesn't." Non-reproducible logs.
If an agent cannot tell whether it made a mistake or the environment did, it cannot learn.
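To make this concrete, here is a minimal TypeScript sketch of what "same input, same ordered steps" can look like. The planSteps and seededShuffle functions are illustrative only, not part of Lex or LexRunner: ordering is derived from the input alone, and any randomness takes an explicit seed so a run can be reproduced.

```ts
// Hypothetical planner sketch. Determinism comes from sorting on a stable key
// rather than insertion order or wall-clock time, and from threading an
// explicit seed instead of calling Math.random().

interface Task {
  id: string;
  description: string;
}

interface Step {
  taskId: string;
  order: number;
}

// Same tasks in, same ordered steps out.
function planSteps(tasks: Task[]): Step[] {
  return [...tasks]
    .sort((a, b) => a.id.localeCompare(b.id))
    .map((task, index) => ({ taskId: task.id, order: index }));
}

// When randomness is genuinely required, the seed is part of the input, so the
// same seed reproduces the same shuffle. Simple LCG, for illustration only.
function seededShuffle<T>(items: T[], seed: number): T[] {
  const result = [...items];
  let state = seed >>> 0;
  for (let i = result.length - 1; i > 0; i--) {
    state = (state * 1664525 + 1013904223) >>> 0;
    const j = state % (i + 1);
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}
```

Run it twice on the same input and you get byte-for-byte identical output. That property is what lets an agent tell its own mistakes apart from the environment's.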
2 Structured Over Conversational
JSON > prose. Tables > paragraphs. Schemas > vibes.
Agents can read natural language. But they thrive on structure.
Good AX:
- Every CLI that matters has a --json flag
- MCP tools and HTTP endpoints return typed, schema-validated structures
- Errors follow a known schema, not ad-hoc strings
Bad AX: "See the README for details" with no structured summary. APIs where you have to regex the response to figure out what happened.
3 Fail Loud, Recover Clear
Failing is fine. Silent or opaque failure is not.
A human can sometimes guess the missing step. An agent will either hallucinate a fix, try the same thing again, or give up and escalate.
AX-friendly failures:
- Explain what failed
- Explain why (as best as you can)
- Suggest 1-3 concrete next actions
```json
{
  "ok": false,
  "error": {
    "code": "MISSING_CONFIG_FIELD",
    "message": "lex.yaml is missing required field 'instructions.targets[0].path'.",
    "context": { "configPath": "lex.yaml", "field": "instructions.targets[0].path" },
    "nextActions": [
      "Open lex.yaml and add at least one target with a path.",
      "Re-run 'lex instructions init' to generate a starter config."
    ]
  }
}
```
Bad AX errors: "Error: undefined", silent failures, "Something went wrong, please try again"
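For reference, the envelope above maps cleanly onto a small discriminated union that agent-side code can branch on before acting. The type names here are ours and only mirror the JSON example; they are not an official Lex schema.

```ts
// Type-level version of the error envelope shown above.
interface AxError {
  code: string;                      // stable, machine-matchable identifier
  message: string;                   // readable explanation of what failed
  context?: Record<string, unknown>; // where and what, as structured data
  nextActions: string[];             // 1-3 concrete recovery suggestions
}

interface AxFailure {
  ok: false;
  error: AxError;
}

interface AxSuccess<T> {
  ok: true;
  result: T;
}

type AxResponse<T> = AxSuccess<T> | AxFailure;

// An agent (or the harness around it) branches on `ok` instead of parsing prose.
function unwrap<T>(response: AxResponse<T>): T {
  if (!response.ok) {
    throw new Error(
      `${response.error.code}: ${response.error.message}\n` +
        `Next actions:\n- ${response.error.nextActions.join("\n- ")}`
    );
  }
  return response.result;
}
```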
4 Memory Is A Feature
Agents without context are expensive and dumb.
Every new conversation starts fresh unless you design memory into the system. Without memory, an agent re-learns context, repeats mistakes, burns tokens rediscovering decisions, and cannot build on its own past work.
Good AX:
- Memory store with stable schema, searchable reference points, module/domain scope
- Workflows begin with recall: "What do we already know?"
- Workflows end with receipts: "Here's what we did, and how to find it again"
In Lex, memory is a first-class contract: lex recall before action, lex remember after. The system remembers, not the model. Any model that can read the memory format can pick up work where another left off.
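A hypothetical sketch of that contract in code: a stable entry shape plus a recall-before, remember-after bracket around a unit of work. Every name below is illustrative; the actual Lex memory format may look quite different.

```ts
// Sketch of a memory contract: stable schema, searchable reference points,
// scoped by module/domain. Field names are illustrative only.
interface MemoryEntry {
  id: string;        // stable reference point, e.g. "frame:2025-12-01-release"
  scope: string;     // module/domain scope, e.g. "lexrunner/orchestration"
  summary: string;   // what was decided or done
  createdAt: string; // ISO timestamp
  links: string[];   // pointers to PRs, files, or prior entries
}

interface MemoryStore {
  recall(query: { scope: string; text?: string }): Promise<MemoryEntry[]>;
  remember(entry: Omit<MemoryEntry, "id" | "createdAt">): Promise<MemoryEntry>;
}

// The workflow bracket: begin with recall, end with a receipt.
async function withMemory<T>(
  store: MemoryStore,
  scope: string,
  work: (prior: MemoryEntry[]) => Promise<{ result: T; summary: string; links: string[] }>
): Promise<T> {
  const prior = await store.recall({ scope });          // "What do we already know?"
  const { result, summary, links } = await work(prior); // do the actual task
  await store.remember({ scope, summary, links });      // "Here's what we did"
  return result;
}
```

The important property is not the field names but the bracket: work never starts blind and never ends without a receipt.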
5 Teach Through Constraints
Guardrails are not limitations. They are the curriculum.
Unbounded agents make unbounded mistakes. Constraints—tool budgets, allowed paths, required outputs—shape behavior toward correctness.
But here is the part we need to be honest about: individual agents do not learn across sessions. A stateless model does not benefit from the constraint it hit yesterday. So who is learning?
The answer: the system learns, and designers learn by watching agents fail within bounds.
When you set a tool budget and watch an agent blow through it on the wrong path, you learn that your task decomposition is bad. When you require a specific output format and the agent cannot produce it, you learn that your instructions are ambiguous. When you scope an agent to certain files and it keeps trying to escape, you learn that your architecture has unclear boundaries.
Constraints are not training data for the model. They are feedback loops for the humans designing the system.
Good AX constraints:
- Tool budgets per task ("You may call at most N tools; choose wisely")
- Clear scope limits ("Only touch src/foo and tests/foo")
- Required outputs ("You must emit at least one Frame for this run")
- Guardrail profiles that are stable and enforced
- Observable failures: when constraints are violated, log it, track it, learn from it
Bad AX: "Do whatever you think is best." Unlimited tool access. Constraints that change silently. Unobserved constraint violations—the agent broke the rule, but no one saw it happen.
AX Maturity Model
A simple way to think about how AX-native your system is: somewhere in the middle of the maturity curve, structured output is partial (a --json flag exists here and there), error messages are basic, documentation exists but agents have to scrape it, and memory is ad-hoc or manual.
Our target for Lex and LexRunner is Level 3: Agent-Native.
AX As A Real Contract, Not Just Vibes
For AX to matter, it cannot just be a philosophy. It has to become a contract.
In our stack, that takes the form of a small AX contract file (for example AX-CONTRACT.md) that says, in plain language:
- What we guarantee to agents (e.g., all important CLIs support --json)
- What we do not guarantee yet
- How we enforce those guarantees (tests, schemas, CI checks)
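Enforcement is what keeps the contract from rotting into vibes. Here is a sketch of one such check as a Node/TypeScript CI step; the command it invokes is illustrative, not a documented Lex interface.

```ts
// CI sketch: every CLI the contract covers must emit parseable JSON with a
// boolean "ok" field when called with --json. Extend COMMANDS as the contract grows.
import { execFileSync } from "node:child_process";

const COMMANDS: [string, string[]][] = [
  ["lex", ["recall", "--json"]], // hypothetical invocation, for illustration only
];

let failed = false;

for (const [cmd, args] of COMMANDS) {
  try {
    const stdout = execFileSync(cmd, args, { encoding: "utf8" });
    const output = JSON.parse(stdout);
    if (typeof output.ok !== "boolean") {
      throw new Error("output is JSON but has no boolean 'ok' field");
    }
  } catch (err) {
    failed = true;
    console.error(`AX contract violation: ${cmd} ${args.join(" ")} -> ${err}`);
  }
}

process.exit(failed ? 1 : 0);
```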
Once AX v0.1 is adopted, all new features and changes must comply with the contract. Retrofitting old features to meet AX is a priority, not a nice-to-have for later.
If you are reading this in a repo past v0.1, you should check for a newer AX contract or immediately raise an issue. That is part of the social contract: AX is allowed to evolve, but it is not allowed to silently rot.
Dogfooding AX
The best AX research is using agents to build and operate the system itself.
The loop looks like this:
- Use agents for real work (coding, reviews, merges, releases)
- When friction happens, capture it as AX feedback (a sketch of the record follows this list)
- Classify it:
- Was this a determinism problem?
- A structure problem?
- An error recovery problem?
- A memory problem?
- A constraint problem?
- Fix or track it: Trivial → fix inline. Non-trivial → open issues.
- Repeat.
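Capturing friction works better when the record has a stable shape, so the classification above is not left to memory. A hypothetical sketch, not an existing Lex schema:

```ts
// Sketch of an AX feedback record, mirroring the classification in the loop above.
type AxFrictionKind =
  | "determinism"
  | "structure"
  | "error-recovery"
  | "memory"
  | "constraint";

interface AxFeedback {
  kind: AxFrictionKind;
  summary: string;    // what the agent hit, in one sentence
  evidence: string[]; // log excerpts, failing commands, transcript links
  resolution: "fixed-inline" | "issue-opened" | "tracked";
  issueUrl?: string;  // present when resolution is "issue-opened"
}

// Example: a structure problem that was trivial enough to fix inline.
const example: AxFeedback = {
  kind: "structure",
  summary: "CLI wrote a progress bar to stdout, corrupting --json output",
  evidence: ["transcript excerpt from the failing run"],
  resolution: "fixed-inline",
};
```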
The goal is not to make agents "happy." The goal is to make them effective.
If agents can reliably discover what they need, call tools correctly, recover from failures, and build on prior work—then humans can move up a level: design, governance, ethics, and truly hard problems.
Why AX Matters
AX is not about worshipping AI. It is not even really about making agents "happy." Agents do not have preferences. They have operational constraints.
AX matters because agents are a forcing function for better system design.
When you design for agents, you are forced to:
- Make implicit assumptions explicit
- Encode decisions in structured formats
- Build observable failure modes
- Create memory that outlives any single session
- Define clear boundaries instead of relying on "you just have to know"
These are not gifts to agents. They are discipline for humans. The agent is just the thing that breaks when the discipline is missing.
Agents are already:
- writing code
- operating CI/CD
- editing documentation
- triaging incidents
- answering support tickets
We can either pretend they are "just another user agent string," or we can acknowledge that they expose every implicit assumption we have been papering over for years.
When we design for agents on purpose:
- We get more reliable automation
- We get systems that are easier to reason about
- We reduce hallucinations by not forcing models to guess through noise
- We make it easier to swap or mix models because the environment is sane
And we get something else too:
We get a clearer picture of how our systems behave, because designing for agents forces us to make our expectations explicit.
The "experience" framing is marketing. The real insight is legibility.
Systems that agents can operate are systems that humans can finally understand—not just use.
That is AX.
Coined: December 1, 2025
Authors: Guffawaffle and friends
Project: SmarterGPT / Lex / LexRunner