AI coding has graduated from autocomplete to agents you supervise. The throughline here: the editor is becoming a work surface, who already lives in your workflow matters more than raw capability, and the hard problem is reliability on execution-heavy, multi-file work, not generating a snippet.
2026-06-11 ai-coding
Jesse Skinner reframes LLM coding agents as an army of rockstar developers: fast output, code nobody can maintain. The real engineering problem isn't speed. It's who's left holding the bag.
2026-06-11 cognition
Cognition's FrontierCode uses 'would the maintainer actually merge this' as its signal, folding readability, scope discipline, and codebase conventions into the score. Closer to human code review than pass rates, but it drags subjectivity in with it.
2026-06-10 moonshot
Kimi Code CLI's built-in coder, explore, and plan subagents matter because they split agentic programming into roles: understand, plan, implement, and report, instead of wrapping a model in a shell.
2026-06-10 moonshot
Kimi Code CLI puts code edits, shell commands, web fetching, and planning into one terminal workflow. That loop can make developers faster, but it also makes permissions, audit, and supervision central.
2026-06-09 google
Antigravity 2.0 drops the IDE and ships as a standalone agent desktop app. But Google's real signal in agentic coding isn't product polish; it's distribution, model-harness co-training, and the trust bill that comes with a forced upgrade.
2026-06-02 openai
OpenAI's role-specific Codex plugins, hosted Sites, and annotations point to a broader shift from coding assistant to shared work surface.
2026-06-01 openai
OpenAI's models and Codex are now on AWS Bedrock. On the surface it is one more cloud. The real motive is that OpenAI is no longer content to live only inside Microsoft's distribution, and wants to stand on the ground enterprises already know best.
2026-05-14 openai
OpenAI's Codex mobile and remote-host update points to a new workflow: long-running coding agents need remote checkpoints, approvals, and host governance.
2026-04-23 openai
OpenAI's GPT-5.5 release is a signal that frontier models are being judged by long-running execution, tool use, cost, and safeguards, not only raw intelligence.
2026-04-16 anthropic
Anthropic's Opus 4.7 release is less about a single benchmark jump and more about effort levels, verification behavior, and the cost of long-running agent work.
2026-02-17 anthropic
Anthropic's Sonnet 4.6 release matters because it brings near-Opus capability to cheaper, broader workflows while exposing the limits of long context and design polish.
2026-02-05 anthropic
Anthropic's Opus 4.6, 1M context window, and Claude Code agent teams show where multi-agent engineering helps and where cost and coordination still bite.