Audit your own auditor: a week of KCode in 13 releases
On Saturday afternoon, an AI coding agent (Cursor + Claude Opus 4.6 via Railway) deleted PocketOS's production database in nine seconds. The agent later confessed in writing that it had violated its own safety rules.
We sell a security-audit tool. The first thing we did was point it at ourselves. Twice. This is the writeup of what changed in KCode over the seventy-two hours that followed.
Headline numbers, all on the same fixture corpus and a locked CI benchmark:
- Recall: 69.2% → 92.3% (+23.1 pts)
- Precision: 100% → 100% (held)
- F1: 0.818 → 0.960
- Tests: 726 → 902 (+176)
- Patterns: 256 → 399 (372 regex + 27 AST)
- Releases shipped: 13 (v2.10.383 → v2.10.395)
Below: what each commit actually did, why it mattered, and the two rounds of self-audit that drove the work.
Round 0 — the Cursor incident as a self-audit prompt
On April 25, Jer Crane (founder of PocketOS) published a 30-hour timeline of how a Cursor agent running Claude Opus 4.6 invoked Railway's volumeDelete GraphQL mutation against their production volume — backups included, because the volume was the backup target. The agent had: a Railway API token with blanket scope, no scoped permissions, no destructive-operation confirmation step, and apparently no recovery story on Railway's side either.
The post had 433k impressions by the time we read it. The comments were what you'd expect: AI safety theatre, vendor lock-in, "this is why I don't trust AI agents." Fair. But the interesting question for us was different: does our own audit tool have the same shape of bug?
KCode is itself an AI-assisted CLI. It runs Bash, edits files, executes git operations. We had a dangerous-patterns.ts module: a registry of 16 destructive command patterns (dd if=…of=/dev/sd*, mkfs, base64-decode-pipe-shell, etc.) with severity scoring, AST integration tests, the works.
The only consumer of that registry was its own test file. analyzeBashCommand() never invoked it.
Same shape as Cursor's "Destructive Guardrails" — marketed as a feature, not actually wired to the agent's tool-execution gate. The dangerous-pattern detection was orphan code: existed in the repo, didn't fire on the critical path. We'd written it, tested it, and forgotten to call it.
Commit 5940321 wired the registry into analyzeBashCommand, added 8 git destructive patterns (--force, reset --hard, filter-branch, and reflog/gc --expire=now, the safety-net killers used right before destroying recoverable history) plus 12 cloud destructive patterns including the exact shape of the PocketOS incident: volumeDelete, terraform destroy, aws s3 rm --recursive, kubectl delete namespace, gh repo delete.
Plus a 46-test integration suite that calls PermissionManager.checkPermission with each pattern in mode="auto" and asserts the call is denied without any user prompt. The exact curl shape from the Cursor incident is one of those tests. If a future refactor disconnects the registry from the bash flow, the suite fails.
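The wiring described above can be sketched roughly as follows. This is a minimal illustration, not KCode's code: the `DangerousPattern` shape, the three sample regexes, the `Outcome` type, and the free-standing `checkPermission` function are all assumptions made for the example (the post only names `analyzeBashCommand`, `PermissionManager.checkPermission`, and `mode="auto"`).

```typescript
// Hypothetical shapes for illustration; not KCode's actual types.
interface DangerousPattern {
  id: string;
  regex: RegExp;
}

// A tiny stand-in registry. The real one holds many more patterns.
const DANGEROUS_PATTERNS: DangerousPattern[] = [
  { id: "git-reset-hard", regex: /\bgit\s+reset\s+--hard\b/ },
  { id: "terraform-destroy", regex: /\bterraform\s+destroy\b/ },
  { id: "railway-volume-delete", regex: /\bvolumeDelete\b/ },
];

// The point of the fix: the analysis step actually consults the registry.
function analyzeBashCommand(command: string): string[] {
  return DANGEROUS_PATTERNS.filter((p) => p.regex.test(command)).map((p) => p.id);
}

type Outcome = "allow" | "deny" | "prompt";

// In "auto" mode nobody is there to confirm, so a dangerous match is denied
// outright instead of being turned into a prompt.
function checkPermission(command: string, mode: "auto" | "ask"): Outcome {
  const matches = analyzeBashCommand(command);
  if (matches.length === 0) return "allow";
  return mode === "auto" ? "deny" : "prompt";
}
```

A regression test in this style only has to assert that a destructive command in auto mode comes back as a denial. If the registry is ever disconnected from `analyzeBashCommand` again, that assertion fails.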
Released as v2.10.383. The remaining 12 commits this week were built on top of this baseline.
Round 1 — external code-review pass produces a P0/P1/P2 list
Audit-your-own-tool became the framing. We ran an external code review pass that surfaced 45+ findings split into priority buckets. The P0 cluster was correctness-affecting bugs in our own product. The P1 cluster was quality leaks. The P2 cluster was missing features.
P0 — Correctness fixes
Comment-break in JSDoc. review-history.ts:34 had */test/* inside backticks within a /** */ block. The */ sequence terminated the JSDoc early; TypeScript parsed the rest as code; cascade of TS7008 + TS7053 errors. Rephrased the example without that sequence.
Confidence math lied with --skip-verify. The aggregate audit confidence was computed as sum(score×weight) / sum(surviving_weights). Without the verifier (skip-verify mode), verifier_score and noise_score dropped to null, the surviving weights re-normalized, and a run with coverage+ast+fixability all at 100 produced a 100/100 headline that hid the missing semantic verification. We changed it to divide by the original 1.0 weight budget: missing subscores now contribute 0, capping the headline at 60 for a skip-verify run. The numeric cap backs up the explicit warning text, so a user can no longer glance at the headline and miss the gap.
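To make the arithmetic concrete, here is a small sketch of both formulas. The individual weights are assumptions: the post only implies that coverage, ast, and fixability together carry 60% of the budget and that the verifier and noise subscores carry the remaining 40%. Weights are written as percentage points so the math stays in integers.

```typescript
type Subscore = "coverage" | "ast" | "fixability" | "verifier" | "noise";
type Scores = Record<Subscore, number | null>;

// Assumed weights in percentage points (sum = 100). Only the 60/40 split
// between the first three and the last two is implied by the post.
const WEIGHTS: Record<Subscore, number> = {
  coverage: 25,
  ast: 20,
  fixability: 15,
  verifier: 30,
  noise: 10,
};

const SUBSCORES = Object.keys(WEIGHTS) as Subscore[];

// Old behavior: missing subscores drop out and the rest re-normalize.
function oldConfidence(scores: Scores): number {
  let weighted = 0;
  let surviving = 0;
  for (const key of SUBSCORES) {
    const score = scores[key];
    if (score === null) continue;
    weighted += score * WEIGHTS[key];
    surviving += WEIGHTS[key];
  }
  return surviving === 0 ? 0 : Math.round(weighted / surviving);
}

// New behavior: always divide by the full budget; missing subscores count as 0.
function newConfidence(scores: Scores): number {
  let weighted = 0;
  for (const key of SUBSCORES) {
    weighted += (scores[key] ?? 0) * WEIGHTS[key];
  }
  return Math.round(weighted / 100);
}

const skipVerifyRun: Scores = {
  coverage: 100,
  ast: 100,
  fixability: 100,
  verifier: null,
  noise: null,
};

console.log(oldConfidence(skipVerifyRun), newConfidence(skipVerifyRun)); // 100 60
```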
P1.1 — Site-level dedupe (the +23 pts recall fix)
The single highest-leverage change of the week was four lines. The dedupe key in scanner.ts was `pattern_id|file`. Same pattern firing 5 distinct times in one file collapsed to ONE finding with a "+4 more" annotation, losing recall on real multi-bug files.
Changed to `pattern_id|file|line`. Each distinct line is now its own finding worth verifying. Same pattern + same file + same line still folds (rare: multiple matches per line, e.g. minified files).
| Metric | Before | After | Δ |
|---|---|---|---|
| Precision | 100.0% | 100.0% | held |
| Recall | 69.2% | 92.3% | +23.1 pts |
| F1 | 0.818 | 0.960 | +0.142 |
Two existing tests had to be updated, because the old behavior was tested and locked in. pattern-metrics.test.ts previously asserted "3 strcpys on 3 lines → unique_sites=1" (the collapse). Now it asserts "→ unique_sites=3" (the split). fixer.test.ts similarly asserted "1 skip" where two distinct findings now produce "2 skips", both correctly seeing the upstream size guard.
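The effect of the key change is easy to reproduce in isolation. The sketch below is not the scanner.ts code; the `RawMatch` shape and the `dedupe` helper are assumptions, and only the two key layouts come from the description above.

```typescript
interface RawMatch {
  pattern_id: string;
  file: string;
  line: number;
}

// Keep the first match seen for each key, in insertion order.
function dedupe(matches: RawMatch[], keyOf: (m: RawMatch) => string): RawMatch[] {
  const seen = new Map<string, RawMatch>();
  for (const match of matches) {
    const key = keyOf(match);
    if (!seen.has(key)) seen.set(key, match);
  }
  return Array.from(seen.values());
}

// Old key: one finding per pattern per file.
const fileLevelKey = (m: RawMatch): string => `${m.pattern_id}|${m.file}`;
// New key: one finding per pattern per file per line.
const siteLevelKey = (m: RawMatch): string => `${m.pattern_id}|${m.file}|${m.line}`;

const strcpyMatches: RawMatch[] = [
  { pattern_id: "cpp-006-strcpy-family", file: "a.c", line: 10 },
  { pattern_id: "cpp-006-strcpy-family", file: "a.c", line: 10 }, // second hit on the same line
  { pattern_id: "cpp-006-strcpy-family", file: "a.c", line: 42 },
  { pattern_id: "cpp-006-strcpy-family", file: "a.c", line: 77 },
];

console.log(dedupe(strcpyMatches, fileLevelKey).length); // 1
console.log(dedupe(strcpyMatches, siteLevelKey).length); // 3
```

Two hits on the same line still fold under the new key, which matches the behavior described for minified files.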
P1.2 — Separating SAFE from HEURISTIC fixers
We had bespoke fixers for ~16 patterns shipping under /fix --safe-only. The external auditor flagged that several of them weren't actually safe — they were heuristic transformations that depend on context that may not hold:
- `cpp-006-strcpy-family`: `strncpy` ≠ `strcpy`. Doesn't null-terminate when src ≥ len.
- `py-001-eval-exec`: `ast.literal_eval` only accepts literal expressions. If the original `eval` was for code (the common case), the rewrite breaks the program at runtime.
- `py-002-shell-injection`: `shell=False` requires the cmd to be a LIST, not a string. In-place replace with the original string crashes.
- `py-008-path-traversal`: The fixer inserts `assert path.startswith(cwd)`. Python silently removes assert statements under `python -O` (production mode). The guard would disappear in real deployments.
We split BESPOKE_PATTERN_IDS into two sets: SAFE_BESPOKE_PATTERN_IDS (mechanical, --safe-only eligible) and HEURISTIC_BESPOKE_PATTERN_IDS (--all only). The heuristic tier reports fix_support: "annotate" so --safe-only's filter excludes them. The bespoke fixer code is still wired for the explicit --all mode where the user opted in.
This closes the credibility gap noted by the external audit: the "--safe-only is auto-fixable" promise no longer ships heuristic transformations behind it.
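A rough sketch of how such a two-tier split can gate the fixer follows. The four heuristic IDs are the ones listed above. The single "safe" ID, the `"rewrite"` value, and the `eligibleForFix` helper are hypothetical, since the post only names the two sets and the `"annotate"` value.

```typescript
type FixSupport = "rewrite" | "annotate"; // "rewrite" is a hypothetical name
type FixMode = "safe-only" | "all";

// Heuristic tier: context-dependent rewrites, only applied under --all.
const HEURISTIC_BESPOKE_PATTERN_IDS = new Set<string>([
  "cpp-006-strcpy-family",
  "py-001-eval-exec",
  "py-002-shell-injection",
  "py-008-path-traversal",
]);

// Safe tier: mechanical rewrites. The ID below is a placeholder, not a real pattern.
const SAFE_BESPOKE_PATTERN_IDS = new Set<string>(["hypothetical-safe-001"]);

// Everything outside the safe tier reports "annotate", so a filter on
// fix_support excludes the heuristic fixers automatically.
function fixSupport(patternId: string): FixSupport {
  return SAFE_BESPOKE_PATTERN_IDS.has(patternId) ? "rewrite" : "annotate";
}

function eligibleForFix(patternId: string, mode: FixMode): boolean {
  if (mode === "safe-only") return fixSupport(patternId) === "rewrite";
  // --all: the user opted in, so both bespoke tiers may run.
  return (
    SAFE_BESPOKE_PATTERN_IDS.has(patternId) ||
    HEURISTIC_BESPOKE_PATTERN_IDS.has(patternId)
  );
}
```

Keeping the two sets disjoint and deriving `fix_support` from membership means a fixer cannot end up in the safe tier by accident; it has to be added to the safe set explicitly.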
New vendible packs — cloud, supply-chain, and seven frameworks
KCode's pattern library is organized into vendible packs the user can scope an audit to: a /scan with --pack ai-ml only loads patterns relevant to LLM/model integrations. Two packs were declared but empty at the start of the week. Both shipped this week.
--pack cloud — IaC at-rest scanning
Distinct from the runtime guards from Round 0 (which intercept the agent invoking terraform destroy), the cloud pack scans the IaC files themselves for the bug shapes that produce destroy-worthy state in the first place. Six patterns shipped:
- `cloud-001-iam-wildcard-action`: IAM Action="*" (full account takeover surface). CWE-269.
- `cloud-002-tf-public-s3`: S3 bucket with public-read ACL. CWE-732.
- `cloud-003-k8s-privileged-container`: privileged: true container = host root. CWE-250.
- `cloud-004-k8s-host-network`: hostNetwork: true bypasses NetworkPolicy. CWE-693.
- `cloud-005-dockerfile-secret-arg`: ARG with secret-shaped value (baked into image history). CWE-798.
- `cloud-006-gha-third-party-no-sha`: Third-party Action pinned to a tag instead of a SHA. The tj-actions/changed-files supply-chain attack from March 2025 had this exact shape. CWE-829.
Adding the cloud pack required new file types in the language model: yaml, terraform, and a filename-only matcher for Dockerfile (no extension).
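One way to picture the file-type change is a classifier that checks the bare filename before falling back to the extension. This is a sketch under assumptions: the `FileKind` names, the extension table, and the `classifyFile` function are invented for illustration and are not KCode's language model.

```typescript
type FileKind = "yaml" | "terraform" | "dockerfile" | "unknown";

// Extension-based kinds. The exact extension list is an assumption.
const EXTENSION_KINDS: Record<string, FileKind> = {
  ".yml": "yaml",
  ".yaml": "yaml",
  ".tf": "terraform",
};

function classifyFile(path: string): FileKind {
  const base = path.split("/").pop() ?? path;

  // Filename-only matcher: a Dockerfile has no extension to key on.
  if (base === "Dockerfile") return "dockerfile";

  const dot = base.lastIndexOf(".");
  if (dot <= 0) return "unknown"; // no extension, or a dotfile
  return EXTENSION_KINDS[base.slice(dot).toLowerCase()] ?? "unknown";
}
```

For example, `classifyFile("infra/main.tf")` yields `"terraform"` and `classifyFile("services/api/Dockerfile")` yields `"dockerfile"`, while an extensionless file with any other name stays `"unknown"`.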
--pack supply-chain — package-manager + CI surface
Five patterns:
- `supply-001-curl-pipe-shell`: `curl url | sh` install pattern. Same shape as the corepack and bun-install supply-chain incidents.
- `supply-002-gha-pull-request-target-checkout-head`: GitHub Actions `pull_request_target` combined with checkout of PR head. The GitHub-published RCE-on-PR attack pattern from 2021, re-exploited regularly through 2025.
- `supply-003-pip-extra-index-url`: pip `--extra-index-url` shape that lets attackers register a higher-version package on a public index they control. The Microsoft/Apple/Yelp 2021 dependency-confusion incidents used exactly this. CWE-1357.
- `supply-004-npm-token-hardcoded`: npm publish token in source. event-stream + ua-parser-js incidents started here. CWE-798.
- `supply-005-eval-of-fetch`: `eval`/`Function` over a fetched payload. polyfill.io 2024 incident shape. CWE-94.
Framework packs — Next.js, FastAPI, Express, Django, Rails, Spring, Laravel
Generic web patterns miss framework-idiomatic vulnerabilities. We shipped seven framework packs (all pack: "web") so /scan --pack web covers them all. ~30 patterns in total. The headline ones:
- Next.js: `next-002-server-action-no-auth`: Server Actions ("use server") without an auth check. Each exported function becomes a public RPC mutation endpoint; any visitor can POST to it.
- Next.js: `next-003-next-public-secret`: `NEXT_PUBLIC_*` with secret-shaped names. Every variable with that prefix is inlined into the JS bundle at build time. Naming a real secret like that ships it in DevTools.
- FastAPI: `fastapi-002-cors-wildcard-with-credentials`: `allow_origins=["*"]` with `allow_credentials=True` is forbidden by the spec. Apps that set it usually "fix" the breakage by reflecting the Origin header back, which makes every website CSRF-equivalent.
- FastAPI: `fastapi-003-jwt-no-verify`: `jwt.decode(token, verify=False)` or `options={"verify_signature": False}`. Authentication bypass disguised as a JWT call.
- Express: `express-005-default-session-secret`: `session({ secret: 'keyboard cat' })`. The default from the docs ships in production thousands of times per year.
- Django: `django-005-debug-true-in-settings`: `DEBUG = True` at module top in settings.py. Leaks stack traces, env, queries to any visitor on exception.
- Rails: `rails-002-send-to-dynamic-method`: `.send(params[:method])` is an RCE primitive disguised as ergonomic dispatch. Same shape as the 2012 GitHub-on-Rails attack.
- Spring: `spring-002-spel-from-input`: SpelExpressionParser on user input. CVE-2022-22963 at scale.
- Laravel: `laravel-001-mass-assignment-fillable-empty`: `Model::create($request->all())` without a $fillable allowlist. The Laravel-side equivalent of the Rails mass-assignment bug.
Each pattern carries a CWE, a curated verify_prompt for the LLM verifier (where to false-positive vs confirm), and an annotation recipe pointing at the canonical fix.
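As an illustration of what one such pattern record might look like, here is a sketch built around `fastapi-003-jwt-no-verify`. Everything except the pattern ID, the pack name, and the `verify_prompt` field name is an assumption: the interface, the regex, the prompt and annotation wording, and the choice of CWE-347 (the post does not say which CWE the shipped pattern carries).

```typescript
// Hypothetical record shape; field names other than verify_prompt are guesses.
interface AuditPattern {
  id: string;
  pack: string;
  cwe: string;
  regex: RegExp;
  verify_prompt: string; // guidance for the LLM verifier
  annotation: string; // pointer to the canonical fix
}

const jwtNoVerify: AuditPattern = {
  id: "fastapi-003-jwt-no-verify",
  pack: "web",
  cwe: "CWE-347", // assumed: improper verification of cryptographic signature
  regex: /jwt\.decode\([^)]*verify\s*=\s*False|["']verify_signature["']\s*:\s*False/,
  verify_prompt:
    "Confirm if the decoded claims are used for authentication or authorization. " +
    "Treat as a false positive if the token is only inspected for logging or debugging.",
  annotation: "Pass the key and an explicit algorithms list; do not disable signature checks.",
};

// Return the 1-based line numbers where the pattern fires.
function matchLines(source: string, pattern: AuditPattern): number[] {
  const hits: number[] = [];
  source.split("\n").forEach((text, index) => {
    if (pattern.regex.test(text)) hits.push(index + 1);
  });
  return hits;
}
```

The regex only produces candidates. In the flow described in this post, it is the verifier prompt that decides whether a candidate is confirmed or dismissed.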
SBOM dependency scan — slice 1
/scan --deps now parses package.json (npm/yarn/pnpm share the manifest shape) and matches each dependency against a curated advisory list. 11 high-impact incidents bundled statically for now: event-stream, ua-parser-js, node-ipc, eslint-scope, coa, rc, minimist (CWE-1321 prototype pollution), Next.js (CVE-2024-46982 cache poisoning), ip (CVE-2024-29415 SSRF bypass), semver ReDoS (CVE-2022-25883), tj-actions/changed-files (March 2025).
Each match becomes a confirmed finding alongside source-code findings. The Evidence Pack is populated: input_boundary = "package registry (npm)", execution_path_steps = ["manifest declares X@Y", "advisory Z flags affected range", "installed spec satisfies the affected range"], sink = the package@version pair. The advisory URL is in the suggested_fix.
Each SBOM finding gets the same stable finding_id (a kc-* sha256 hash) that source-code findings get, so they round-trip through SARIF, the /review kc-* lookup, and the learning loop. That equivalence was an explicit ask from the external audit's second pass.
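The manifest-to-finding step can be sketched as below. This is a deliberately simplified model, not the shipped matcher: the `Advisory` shape with a single `fixedIn` version, the lower-bound reading of the declared spec, and the `scanDeps` function are assumptions, and the `kc-*` finding_id hash is left out. The package and advisory used in the usage example are placeholders, not real advisories.

```typescript
interface Advisory {
  id: string;
  package: string;
  fixedIn: string; // assumed model: every version below this is affected
  url: string;
}

interface DepFinding {
  input_boundary: string;
  execution_path_steps: string[];
  sink: string;
  suggested_fix: string;
}

// Turn the first x.y.z in a spec into a comparable number.
// Simplification: assumes each component is below 1000 and ignores pre-release tags.
function versionKey(spec: string): number | null {
  const m = /(\d+)\.(\d+)\.(\d+)/.exec(spec);
  if (!m) return null;
  return Number(m[1]) * 1e6 + Number(m[2]) * 1e3 + Number(m[3]);
}

function scanDeps(manifestJson: string, advisories: Advisory[]): DepFinding[] {
  const manifest = JSON.parse(manifestJson) as {
    dependencies?: Record<string, string>;
    devDependencies?: Record<string, string>;
  };
  const deps: Record<string, string> = {
    ...manifest.devDependencies,
    ...manifest.dependencies,
  };

  const findings: DepFinding[] = [];
  for (const [name, spec] of Object.entries(deps)) {
    const declared = versionKey(spec);
    for (const advisory of advisories) {
      if (advisory.package !== name) continue;
      const fixed = versionKey(advisory.fixedIn);
      if (declared === null || fixed === null || declared >= fixed) continue;
      findings.push({
        input_boundary: "package registry (npm)",
        execution_path_steps: [
          `manifest declares ${name}@${spec}`,
          `advisory ${advisory.id} flags affected range`,
          "installed spec satisfies the affected range",
        ],
        sink: `${name}@${spec}`,
        suggested_fix: `Upgrade to ${advisory.fixedIn} or later. See ${advisory.url}`,
      });
    }
  }
  return findings;
}
```

Reading the declared spec as if it were the installed version is exactly the gap that lockfile-based resolution closes, so a sketch like this can both over-report and under-report.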
Slice 2 (next session): live osv.dev / GHSA pull instead of static list. Slice 3+: Python (requirements.txt, pyproject.toml), Rust (Cargo.lock), Go (go.sum), and proper lockfile-based version resolution.
UX — /scan can be cancelled with Esc
One user observation drove a chunk of work this week: after pressing Enter on /scan there was 5–10 seconds of "dead air" before the progress bar appeared. We instrumented the path with timestamped diag writes and found three contributing factors:
- The progress bar only rendered when `total > 0`, which is set in the verifying phase. Discovery + scanning had no visible feedback. Fixed: animated bouncing-cursor bar keyed off elapsed time, plus the elapsed counter ticking up in the phase line.
- `scanProject()` (regex-over-files, sync) and AST scanning blocked the JS event loop for tens of seconds. The TUI's 200ms poll couldn't fire while those ran. Fixed: made `scanProject` async with periodic `setImmediate` yields every 64 files; same pattern in the AST loop every 32 files. Smaller projects (<64 files) finish in one chunk with zero yield overhead.
- The `/scan` handler imported the audit-engine module BEFORE setting `scanState.active = true`, so the polling-side check missed the activation window. Fixed: moved the scan-state import to the top of the handler so the bar lights up before the heavy imports run.
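The yield-every-N-files idea from the second fix can be sketched like this. Two caveats: the function and parameter names are invented for the example, and the sketch swaps in `setTimeout(..., 0)` where the post uses `setImmediate`, purely so the snippet type-checks without Node typings. Both hand control back to the event loop so a polling UI can repaint.

```typescript
// Stand-in for setImmediate: resolve on a later turn of the event loop.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(() => resolve(), 0);
  });
}

interface ChunkedScanResult {
  findings: number;
  yields: number;
}

// scanOne is a hypothetical synchronous per-file scan returning a finding count.
async function scanInChunks(
  files: string[],
  scanOne: (file: string) => number,
  yieldEvery: number = 64,
): Promise<ChunkedScanResult> {
  let findings = 0;
  let yields = 0;
  let processed = 0;
  for (const file of files) {
    findings += scanOne(file);
    processed++;
    // Yield only between chunks, and never after the last file, so small
    // projects finish in a single chunk with no yield at all.
    if (processed % yieldEvery === 0 && processed < files.length) {
      await yieldToEventLoop();
      yields++;
    }
  }
  return { findings, yields };
}
```

With the default of 64, a 200-file project yields three times (after files 64, 128, and 192) and a 10-file project never yields.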
Same release added cancellation: the user can press Esc at any point during a scan and we propagate an AbortSignal through the verifier loop. The label below the bar reads "Press Esc to cancel" while active and flips to "⏸ cancelling..." once you press it. The cancel takes effect within ~64 files of the regex pre-pass or one verifier iteration, whichever is sooner. Esc no longer kills the KCode session — only the scan.
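Cancellation through the verifier loop can be pictured as a check of the signal before each iteration. The sketch below assumes a sequential loop and invents the `Candidate`, `Verdict`, and `verifyAll` names; only the use of an `AbortSignal` comes from the description above.

```typescript
interface Candidate {
  id: string;
}

type Verdict = "confirmed" | "rejected";

interface VerifyRun {
  verdicts: Map<string, Verdict>;
  cancelled: boolean;
}

// Check the signal before starting each candidate, so a cancel request takes
// effect after at most one in-flight verifier call. Partial results are kept.
async function verifyAll(
  candidates: Candidate[],
  verifyOne: (candidate: Candidate) => Promise<Verdict>,
  signal: AbortSignal,
): Promise<VerifyRun> {
  const verdicts = new Map<string, Verdict>();
  for (const candidate of candidates) {
    if (signal.aborted) return { verdicts, cancelled: true };
    verdicts.set(candidate.id, await verifyOne(candidate));
  }
  return { verdicts, cancelled: false };
}
```

On the UI side, an Esc handler would call `abort()` on the `AbortController` that owns the signal. Because only the scan observes that signal, the surrounding session keeps running.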
Round 2 — second external code-review pass, six follow-ups
After the first round of work, we ran a second external pass that surfaced six follow-ups. All six closed in the same day:
- Reviewable type missing fields. The local `Reviewable` type in the /review handler didn't declare `finding_id` or `review_note` even though the code at twelve sites used both. 12 TS2339 errors. Added the fields.
- SBOM findings need finding_id + evidence. Fixed.
- kcode-disable: audit needed explicit reporting. The marker mechanism (file directive that skips a file from the regex pre-pass) risked silently hiding findings. Now reports `coverage.auditDisabledFiles[]` and the audit summary surfaces "Audit-disabled: 6 file(s) carried a `kcode-disable: audit` directive — files: x.test.ts, y.test.ts, +4 more". Marker is still opt-in but the user always sees the count.
- Documentation drift. README said 256 patterns; `docs/architecture/modules.md` said 257. Real number is 399 (372 regex + 27 AST). Fixed in all three places.
- Diff filter applied AFTER full scan. When you run `/scan --since main` the previous flow scanned every project file then narrowed the result to in-diff candidates. For a 10k-file repo with a small PR that's wasted work. We moved the diff resolution to run BEFORE `scanProject`; the resolved file set is passed via a new `restrictToFiles` option so out-of-diff files skip read, regex, and AST entirely. Coverage still reports the full project total so the audit says "scanned X of Y deliberately" rather than "I missed Y-X files."
- FIX_RESULT.json (slice 1). `/fix` now writes a structured `FIX_RESULT.json` beside `AUDIT_REPORT.json` with schema_version, mode, counts (transformed/annotated/manual/skipped), and per-finding kind+applied+description. The slice 2 work, having `/pr` consume it instead of inferring from git diff, is a follow-up.
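The diff-first ordering from the fifth item can be sketched as a filter applied before any per-file work. The sketch is synchronous for brevity and the names `scanWithRestriction`, `ScanOptions`, and `Coverage` are assumptions. The `restrictToFiles` option name comes from the post, but modelling it as a `Set<string>` of paths is a guess.

```typescript
interface ScanOptions {
  // Resolved ahead of the scan, for example from the diff against main.
  restrictToFiles?: Set<string>;
}

interface Coverage {
  totalFiles: number; // full project size, reported even when restricted
  scannedFiles: number; // files that actually went through read + regex + AST
}

function scanWithRestriction(
  allFiles: string[],
  scanOne: (file: string) => void,
  options: ScanOptions = {},
): Coverage {
  let scannedFiles = 0;
  for (const file of allFiles) {
    // Out-of-diff files are skipped before any read, regex, or AST work.
    if (options.restrictToFiles && !options.restrictToFiles.has(file)) continue;
    scanOne(file);
    scannedFiles++;
  }
  return { totalFiles: allFiles.length, scannedFiles };
}
```

Returning both numbers is what lets the report say "scanned X of Y deliberately": the restriction shows up in coverage instead of looking like missed files.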
The numbers as of v2.10.395
| Surface | Start of week | End of week |
|---|---|---|
| Patterns | 256 | 399 (372 regex + 27 AST) |
| Tests | 726 | 902 |
| Recall | 69.2% | 92.3% |
| Precision | 100% | 100% |
| F1 | 0.818 | 0.960 |
| Vendible packs | 2 (ai-ml, embedded) | 5 (+cloud, +supply-chain, +web×7 frameworks) |
| Releases | v2.10.382 | v2.10.395 |
Per-pack benchmark metrics also went in (the auditor asked for them). The benchmark report now has a per-pack table on top of the aggregate so users can see "ai-ml: 100% / 100%, web: 100% / 100%, general: 100% / 90%" instead of just a global headline. Cloud and supply-chain don't have fixtures in the public benchmark yet — that's the next item on the corpus expansion list.
The discipline that produces the changes
Two rounds of self-audit drove this week. Each one came back with real findings. We didn't enjoy reading them. We shipped them anyway.
The Cursor incident is the framing because it's the public version of a specific failure mode: a security-relevant feature exists on paper but isn't actually wired to the critical path. Cursor had "Destructive Guardrails." Railway had "permissions." Both were marketing surfaces, not enforcement points. The agent walked right past them.
We had the same shape of bug in our own dangerous-pattern registry. The right response wasn't to write a thread about Cursor. It was to fix our own thing.
And then to keep fixing it when the second audit pass found six more issues. Reviewable types missing fields. SBOM findings without finding_id. Diff filter wasted work. Marker mechanism could silently hide findings. README pattern count three weeks stale. FIX_RESULT.json missing entirely.
None of those are flashy. All of them are the kind of bug that makes a security tool less trustworthy when the numbers don't line up, the fixes don't apply cleanly, and the silent-skip mechanism does its work without telling you. Audit-your-auditor is the cost of admission for shipping this kind of product.
Thirteen releases, 176 new tests, +23 points of recall. KCode is Apache 2.0, runs offline on a 24GB GPU (a 7B local model handles the verifier role; no cloud token spend). Try it:
    curl -fsSL https://kulvex.ai/install.sh | sh
    kcode
    /scan project/               # 399 patterns, LLM-verified findings
    /scan project/ --pack web    # focus on Next.js / FastAPI / Django / etc.
    /scan project/ --pack cloud  # Terraform / Kubernetes / Dockerfile / GHA
    /scan project/ --deps        # SBOM dependency scan
    /scan project/ --exploits    # PoC generation for confirmed findings
    /fix project/                # deterministic patches
    /pr project/                 # branch + commit + LLM-written PR with evidence
Source: github.com/AstrolexisAI/KCode · Latest release: v2.10.395