Drift Detection · DevOps · Configuration

Config Rot Killed Your Agent's Performance and You Blamed the Model

Config rot isn't just a server problem anymore. When your AI agent configurations drift silently over weeks, the symptoms look like model hallucination — but the root cause is pure infrastructure neglect.

Robert Paul Baquing & Claude·6 min read·May 11, 2026

Config rot is the quiet failure mode that experienced engineers have learned to recognize in production systems. It happens when a configuration file that was intentional and correct at creation time gradually accumulates small, undocumented changes until it no longer reflects technical reality. The config doesn't fail loudly. It fails subtly, in edge cases, in production, on a Friday.

Most developers have experienced config rot in their servers — running a Terraform plan against a long-lived environment and seeing dozens of unexpected diffs, or discovering a security group rule that was quietly removed six months ago.

The same failure mode is now emerging in AI agent configurations. And it's harder to detect because the failure looks like something else entirely.

Why Agent Drift Looks Like Hallucination

When a server's configuration drifts, the server fails. You get a 500, a timeout, or a connection refusal. The signal is clear.

When an AI agent's configuration drifts, the model doesn't fail. It adapts. Claude is a capable model and it will try to be helpful with whatever instructions it currently has.

The core trap: Agent config rot symptoms are blamed on the model, but the cause is in the configuration file.

If the security rules were quietly removed from your code-reviewer agent, Claude will keep performing code reviews — it just won't apply the security criteria anymore. If someone shortened the system prompt by removing the "scope restrictions" paragraph, the model will still review code. It will also start making suggestions in areas it was previously constrained from touching.

A developer notices the agent's output feels inconsistent — sometimes it catches security issues, sometimes it doesn't. They open tickets about model quality. They waste days debugging prompts, not configuration files.

The Three Phases of Configuration Rot

Agent configuration rot typically progresses through three identifiable phases:

Phase	What happens	Why it's invisible
1. Accumulation	Small individual changes — a tool addition here, a description edit there. Git history fills with vague messages like "update agent config"	Each change seems harmless in isolation
2. Incoherence	The accumulated changes produce a configuration nobody specifically chose. Agent has tools that conflict with its rules. Prompt references resources that no longer exist	No single change appears to be the problem
3. Invisible Failure	Configuration is incoherent but still produces output. The developer calibrates expectations to the drifted behavior without realizing it	The drifted state becomes the "baseline"

The third phase is the most dangerous. Once you've unconsciously accepted the degraded behavior as normal, there's no trigger to investigate. The rot becomes permanent.

Diagnosing Drift with xcaffold status

xcaffold treats agent configurations as compiled artifacts. Your .xcaf source files are the single source of truth, and xcaffold apply compiles them into the provider-native directories (.claude/, .cursor/, .gemini/, .github/, .agents/). The state of every compiled artifact is tracked via SHA-256 hashes in .xcaffold/project.xcaf.state.

xcaffold status compares those recorded hashes against the actual content of the files on disk right now. Any discrepancy — even a single changed character — is flagged as drift.
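
The comparison described above can be sketched in a few lines of shell. This illustrates hash-based drift detection in general and is not xcaffold's implementation: the helper names record_hashes and detect_drift are hypothetical, the manifest format is simply whatever sha256sum writes (the real state file's format is not shown here), and the sketch assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch of hash-based drift detection. Illustrative only, not xcaffold's code.
# Assumes GNU coreutils `sha256sum`; function and file names are hypothetical.

# Record a SHA-256 hash for every file under a directory.
#   $1 = directory to snapshot
#   $2 = manifest file to write (keep it outside $1 so it is not hashed itself)
record_hashes() {
  ( cd "$1" && find . -type f -exec sha256sum {} + ) > "$2"
}

# Re-hash the files on disk and compare them with the manifest.
# Prints one line per changed or missing file; returns 0 when clean, non-zero on drift.
# Files added after the snapshot are not in the manifest, so this sketch does not report them.
detect_drift() {
  # --quiet suppresses the per-file "OK" lines, leaving only failures.
  ( cd "$1" && sha256sum --check --quiet 2>/dev/null ) < "$2"
}
```

Calling record_hashes on an output directory right after a build, and detect_drift on the same directory later, reproduces the idea in miniature: a one-character edit or a deleted file changes the result from clean to drifted.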

A clean workspace looks like this:

Shell

$ xcaffold status

sandbox  ·  last applied 3 days ago

  PROVIDER       FILES   STATUS
  antigravity       28   ✓ synced
  claude            90   ✓ synced
  copilot            1   ✓ synced
  cursor            54   ✓ synced
  gemini            55   ✓ synced

  Sources  52 .xcaf files  ·  no changes since last apply

✓ All providers are in sync.

Every provider's output matches the compiled state. Nothing has been tampered with since the last xcaffold apply.

Now compare that to a workspace where someone manually edited or deleted files:

Shell

$ xcaffold status

sandbox  ·  last applied 3 days ago

  PROVIDER       FILES   STATUS
  antigravity       28   ✓ synced
  claude            90   ✗ 1 modified
  copilot            1   ✓ synced
  cursor            54   ✓ synced
  gemini            55   ✗ 1 modified

  Sources  52 .xcaf files  ·  no changes since last apply

Drift detected in 2 providers:

  claude
    ✗  missing   CLAUDE.md  (root)

  gemini
    ✗  missing   GEMINI.md  (root)

→ Run 'xcaffold apply' to restore.
  Run 'xcaffold status --target <name>' for details.

Two providers have drifted. The root-level instruction files for Claude and Gemini are missing — perhaps a git clean removed them, or someone deleted them thinking they were unnecessary. Without drift detection, this would surface as a vague degradation in agent behavior across two different tools.

You can drill into a specific provider to see the full picture:

Shell

$ xcaffold status --target claude

sandbox  ·  claude  ·  applied 3 days ago

  89 synced  ·  1 modified  ·  52 sources unchanged

  ✗  missing   CLAUDE.md  (root)

  Sources  52 .xcaf files  ·  no changes since last apply

→ Run 'xcaffold apply --target claude' to restore.
  Run 'xcaffold status --target claude --all' to see all files.

The fix is a single command: xcaffold apply --target claude. The compiled state is restored from the .xcaf sources, and the hash manifest is updated.

What a Healthy Configuration Lifecycle Looks Like

Config rot is not inevitable. It's the result of a process failure: allowing direct modification of artifacts that should be generated, not edited.

With Harness-as-Code, the lifecycle is enforced at the process level:

1. All changes flow through .xcaf source files, never directly to .claude/ or .cursor/ files
2. xcaffold apply compiles the updated sources into provider-native output directories
3. CI runs xcaffold status after the apply step to validate the compiled state
4. Pre-commit hooks can block commits that modify files inside managed output directories
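
The pre-commit step can be sketched as a small Git hook. This is an assumption-laden example, not something the article says xcaffold ships: the function names managed_paths and check_staged are hypothetical, and the directory list is a guess based on the output directories named earlier (.github/ is left out because it usually also holds hand-written files such as workflows).

```shell
#!/bin/sh
# Sketch of a pre-commit guard for generated agent config.
# Hypothetical helper names; the directory list is an assumption to adjust per project.

# Read file paths on stdin and print those inside a managed output directory.
managed_paths() {
  while IFS= read -r path; do
    case "$path" in
      .claude/*|.cursor/*|.gemini/*|.agents/*) printf '%s\n' "$path" ;;
    esac
  done
}

# Return 1 (block the commit) when any staged path is inside a managed directory.
# `git diff --cached --name-only` lists staged paths, including deletions.
check_staged() {
  hits="$(git diff --cached --name-only | managed_paths)"
  if [ -n "$hits" ]; then
    echo "Refusing to commit hand-edited generated files:" >&2
    echo "$hits" >&2
    echo "Edit the .xcaf sources and run 'xcaffold apply' instead." >&2
    return 1
  fi
  return 0
}
```

To try it, save the functions in .git/hooks/pre-commit, add a final line `check_staged || exit 1`, and make the file executable. If your repository commits generated output alongside the sources, this blanket block will also reject legitimate xcaffold apply results; running xcaffold status inside the hook is one alternative.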

The key insight is that .claude/ should behave like dist/ or target/ in a build system: generated output that's never manually edited. The source of truth lives in .xcaf files, and the compiled output is a deterministic artifact of that source.

This separation also makes code review meaningful. When a developer opens a PR, the reviewer can look at the .xcaf diff and understand the intent. Reviewing a raw markdown agent file gives you no baseline — you're reading prose and hoping it looks right.

Config Rot at the Multi-Provider Level

The drift problem compounds when you're compiling to multiple providers. A developer using both Claude Code and Cursor needs aligned agent definitions across .claude/ and .cursor/. Hand-maintaining both directories is error-prone even for a disciplined solo developer, because every change has to be made twice and nothing flags the copy that was missed.

With multi-target compilation, a single .xcaf source generates all provider directories simultaneously. Running xcaffold status confirms all outputs are clean in one pass — every provider is listed in the overview table, and any drift is immediately visible.

If one provider's directory has drifted and the others haven't, you know exactly which target's output was tampered with. The overview table tells you at a glance: Claude has 1 modified file, everything else is synced. You don't need to compare directories manually or write shell scripts to diff file trees.

And because xcaffold status uses exit codes — 0 for clean, 1 for drift — it integrates directly into CI pipelines:

Shell

# CI step: fail the build if any provider has drifted
xcaffold status || exit 1

No custom scripting. No regex parsing of output. A single binary check that either passes or fails.
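
The one-liner is all a pipeline strictly needs. If you also want the build log to separate real drift from a tooling failure (for example, the binary missing from the CI image), a thin wrapper is one option. In this sketch, drift_gate is a hypothetical helper name, and treating any exit code other than 0 or 1 as a tooling failure is an assumption, since only those two codes are described here.

```shell
#!/bin/sh
# Sketch of a CI gate around `xcaffold status`.
# Exit codes 0 (clean) and 1 (drift) come from the article; the handling of any
# other code is an assumption. `drift_gate` is a hypothetical helper name.

drift_gate() {
  xcaffold status
  status_code=$?
  case "$status_code" in
    0) echo "agent config: all providers in sync" ;;
    1) echo "agent config: drift detected, run 'xcaffold apply' to restore" >&2 ;;
    *) echo "agent config: status check could not run (exit $status_code)" >&2 ;;
  esac
  # Pass the original code through so the CI step fails on anything non-zero.
  return "$status_code"
}
```

A CI step would then call `drift_gate || exit 1`, which keeps the pass/fail behavior of the one-liner while making the failure reason explicit in the log.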

The Terraform Analogy, Completed

If you've used Terraform, this workflow is immediately familiar. terraform plan shows you what would change, terraform apply makes the change, and the state file tracks what exists. If someone manually modifies infrastructure outside Terraform, the next plan reveals the drift.
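
For readers who want the Terraform side of the analogy in script form, terraform plan accepts a -detailed-exitcode flag that reports the result through the exit status: 0 for no changes, 1 for an error, 2 for changes present. The wrapper below is a minimal sketch; terraform_drift is a hypothetical helper name, and note that exit code 2 covers both unapplied configuration edits and out-of-band drift.

```shell
#!/bin/sh
# Sketch: surface pending Terraform changes through the plan exit code.
# With -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present.
# `terraform_drift` is a hypothetical helper name.

terraform_drift() {
  terraform plan -detailed-exitcode -input=false >/dev/null
  plan_code=$?
  case "$plan_code" in
    0) echo "no changes pending" ;;
    2) echo "changes pending: unapplied edits or out-of-band drift" ;;
    *) echo "terraform plan failed (exit $plan_code)" >&2 ;;
  esac
  return "$plan_code"
}
```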

xcaffold follows the same model for agent configurations:

Terraform	xcaffold
`.tf` files	`.xcaf` files
`terraform plan`	`xcaffold status`
`terraform apply`	`xcaffold apply`
`terraform.tfstate`	`.xcaffold/project.xcaf.state`
Provider resources	`.claude/`, `.cursor/`, `.gemini/`, `.github/`, `.agents/`

The parallel is not superficial. The same engineering principles that make infrastructure-as-code reliable — declarative sources, compiled artifacts, state tracking, drift detection — apply directly to agent configurations. The only question is whether you adopt them before or after config rot has already degraded your agents.

Config rot is a management problem before it's a tooling problem. You can't prevent people from editing files directly unless the architecture makes that the wrong path to take. xcaffold makes the wrong path obvious by creating a clear separation between source (.xcaf files) and output (.claude/, .cursor/). The status command makes rot immediately visible, and the compilation pipeline makes the correct path — edit the source, run apply — trivially easy.

Ready to try xcaffold? Detect drift in your agent configs today. Run xcaffold status.

