ai-agents-2-2 · Agents SDK | OpenAI API · Reference for orchestrating multi-step, tool-using agent systems with explicit application control. · Noah Bennett · 9 Jun 2026 · 1 min read
ai-agents-2-2 · ProgramBench: Can Language Models Rebuild Programs From Scratch? · Turning ideas into full software projects from scratch has become a popular use case for language models. Agents are being deployed to seed, maintain, and grow codebases over extended periods with minimal human oversight... · Noah Bennett · 5 May 2026 · 1 min read
ai-agents-2-2 · Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly? · Large Language Models (LLMs) are reshaping almost all industries, including software engineering. In recent years, a number of LLM agents have been proposed to solve real-world software problems. Such software agents are... · Noah Bennett · 17 Nov 2025 · 1 min read
ai-agents-2-2 · SWE-Dev: Building Software Engineering Agents with Training and Inference Scaling · Large language models (LLMs) have advanced rapidly from conversational problem solving to addressing real-world tasks involving tool use, such as software engineering (SWE). Recent LLM-powered toolkits, such as OpenAI Co... · Noah Bennett · 9 Jun 2025 · 1 min read
ai-agents-2-2 · Tree of Thoughts: Deliberate Problem Solving with Large Language Models · Planning and search strategy that informs branching and verification-heavy agent workflows. · Noah Bennett · 17 May 2023 · 1 min read