ai-agents-2-2 Measuring AI agent autonomy in practice A concrete treatment of capability and autonomy measurement, useful for release gating. Owen Blake 9 Jun 2026 · 1 min read
ai-agents-2-2 RepoMirage: Probing Repository Context Reasoning in Code Agents with Perturbations Code agents currently perform well on repository-level software engineering benchmarks, but it remains unclear whether success on end-to-end tasks such as issue resolution truly reflects repository con... Owen Blake 25 May 2026 · 1 min read
ai-agents-2-2 Your Code Agent Can Grow Alongside You with Structured Memory While "Intent-oriented programming" (or "Vibe Coding") redefines software engineering, existing code agents remain tethered to static code snapshots. Consequently, they struggle to model the critical information embedded... Owen Blake 25 Feb 2026 · 1 min read
ai-agents-2-2 TOM-SWE: User Mental Modeling For Software Engineering Agents Recent advances in coding agents have made them capable of planning, editing, running, and testing complex code bases. Despite their growing ability in coding tasks, these systems still struggle to infer and track user i... Owen Blake 24 Oct 2025 · 1 min read
ai-agents-2-2 New tools for building agents Covers modern agent building blocks: Responses API, tool use, and SDK-level orchestration primitives. Owen Blake 11 Mar 2025 · 1 min read