Rankings March 23, 2026

State of AI

by Ben Davis

DISCLAIMER: this is all just my opinion, based on my experiences and what i've used. it is impossible to try everything at the depth i would like, so i've decided to simply focus this site on the tools i use the most every day

01

Models

01

GPT 5.4

default · coding · computer use

My default model for coding.

+
  • Incredible instruction following
  • Recent training data
  • Tool calling and running for a long time in a loop
  • Coding
  • Computer & Browser Use
-
  • Awful at UI
02

Opus 4.6

ui · design · coding

The model I use for making UIs and doing frontend work.

+
  • Great at tool calling
  • Feels really fast (prone to action)
  • Pleasant to talk to
  • Excellent at UI and frontend
-
  • Lacks the "wisdom" of 5.4, resulting in much worse code over time
  • Expensive
03

GPT 5.4 Mini

search · subagents · computer use

A sleeper pick. The speed is absurd, and it's incredible for "sub agent" type tasks (search with tools).

+
  • Excellent tool calling
  • Very very fast
  • Very cheap
-
  • Lacks the overall intelligence of 5.4
04

Gemini 3.1 Pro (and Flash, kinda)

intelligent · google model

Do you want to process a video or image, or turn a giant blob of text into JSON? This is the model (or Flash) for it. It's also pretty good at UI.

+
  • Probably the best multi-modal model right now (video, image, etc.)
  • Turning messy input into structured JSON
  • Pretty good at UI
-
  • Awful at tool calling and working in an agent harness
05

Another sleeper pick I didn't think I would use as much as I do. Anytime I need a quick change/search it's perfect. Also surprisingly good at UI stuff.

+
  • Stupid fast
  • Cheap
  • Great at tool calling
  • Pretty smart and capable
-
  • Can only be used in Cursor harnesses
  • Not as powerful as the frontier from OpenAI and Anthropic
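
Read as a whole, the model picks above work like a rough routing table. Here is a minimal Python sketch of that idea. The task labels and the `pick_model` helper are hypothetical and don't come from any real harness. The fifth pick is left out because it isn't named here.

```python
# Hypothetical routing table built from the picks above. The task labels
# and the pick_model helper are illustrative, not part of any real tool.
ROUTES = {
    "coding": "GPT 5.4",
    "computer use": "GPT 5.4",
    "ui": "Opus 4.6",
    "frontend": "Opus 4.6",
    "search": "GPT 5.4 Mini",
    "subagent": "GPT 5.4 Mini",
    "multimodal": "Gemini 3.1 Pro",
    "structured json": "Gemini 3.1 Pro",
}

DEFAULT_MODEL = "GPT 5.4"  # listed as the default model in this ranking


def pick_model(task: str) -> str:
    """Return the model this ranking reaches for, falling back to the default."""
    return ROUTES.get(task.strip().lower(), DEFAULT_MODEL)


if __name__ == "__main__":
    for task in ("UI", "search", "refactor"):
        print(f"{task}: {pick_model(task)}")
```

Anything that isn't in the table falls through to the default, which matches how the list treats GPT 5.4.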
02

Harnesses

01

T3 Code

desktop app

Yes, I'm biased, but T3 Code is really good. 90% of my work is done from here, with the only real exception being frontend stuff (typically done in Cursor, Opencode, or Pi).

+
  • Very very fast
  • Doesn't lag and overheat my computer
  • Wraps Codex and other existing harnesses so it's A) free and B) gives u the best experience for the models
-
  • Currently only supports Codex and Claude Code, so model selection is limited
02

Cursor

editor · cloud agent · desktop app · cli

Still my default editor.

+
  • Editor is VS Code based: it's what u expect and it works
  • They have the only good cloud agents I've tried
  • Every model is usable
-
  • General stability of the product/platform is rough...
  • Have to use a Cursor sub for the inference
  • Glass, the new desktop UI, is very early and has a ton of issues
03

Pi

cli · sdk

The tui is a tui. Idk what else to say but: it's good.

The really impressive part is the SDK. Best "agent" SDK I've tried by a lot.

+
  • Fast and minimal tui
  • Incredible SDK
  • Can use any model*
-
  • Auth is a little weird with the tui
04

Opencode

cli · desktop app · sdk

Still the most impressive overall tui, and their desktop app has come a long way to actually being quite solid. I just kinda like having it around for random tasks.

+
  • Looks and feels great to use
  • Solid desktop app
  • Excellent client/server architecture making it super extensible
-
  • SDK is pretty bad compared to Pi's. It just does too much
  • TUI isn't a normal TUI: it fully takes over the terminal view, which is great most of the time but can cause weird issues
05

Codex

cli · desktop app · cloud agent · sdk

The best harness for OpenAI models. CLI is great. Desktop app is beautiful, but has a ton of perf issues. I mostly just use it under the hood through T3 Code.

+
  • Desktop app looks beautiful
  • CLI is minimal and performant
  • Excellent harness for the current best models
-
  • Desktop performance is rough
06

It's in a rough state

-
  • Not something I reach for
03

Subscriptions

01

Cursor Ultra

$200 · every model

Your $200 will get you ~$400-$500 a month in inference. Not nearly as insane of a subsidy as the major labs, but still worth having in my opinion.

If you have to pick only one sub, this is the one to get.

+
  • You get a solid editor, the only good cloud agents, and a fine cli.
  • Model variety
-
  • Inference subsidy isn't as crazy as the big labs'
02

OpenAI Codex

$200 · gpt only

Insane value, your $200 will get you thousands of dollars worth of GPT a month if you push it.

If you can afford two subs, this is the other one to get. GPT models are overall the best right now, and OpenAI is super chill about letting you use the sub in other harnesses:

  • Codex
  • T3 Code
  • Opencode
  • Pi
+
  • Insane value for the money
  • Harness flexibility
-
  • GPT models only
03

OpenCode Black

$200 · every model

This is the third sub I'd get if you can spend the money. Not as insane of a value as the other subs, but it seems like you can pretty easily get close to $1000 if you push it (although that could change).

The best part of this one is how flexible Opencode Zen is. It's truly just an API: your sub gets you an API key, which means it's trivial to use as the inference for agents you're building, in other harnesses, etc.

+
  • Opencode Zen is basically an API so you can use it anywhere
-
  • Not as absurd of a value
  • Heavily waitlisted right now, so you might not even be able to get it
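
To put the rough dollar figures side by side, here is a small Python sketch that turns them into value multiples on the $200 price. It only uses the two subs with concrete estimates in the text (Cursor Ultra at ~$400-$500 and OpenCode Black at close to $1000). The `value_multiple` helper is a hypothetical name, and the numbers are loose estimates, not measurements.

```python
PRICE = 200  # monthly price in USD, the same for every sub listed here

# (low, high) estimated monthly inference value in USD, taken from the text.
ESTIMATES = {
    "Cursor Ultra": (400, 500),
    "OpenCode Black": (1000, 1000),  # "close to $1000 if you push it"
}


def value_multiple(low: float, high: float, price: float = PRICE) -> tuple[float, float]:
    """Return (low, high) inference value as a multiple of the monthly price."""
    return low / price, high / price


if __name__ == "__main__":
    for name, (low, high) in ESTIMATES.items():
        low_x, high_x = value_multiple(low, high)
        print(f"{name}: {low_x:.1f}x to {high_x:.1f}x")
```

The subs described only as worth "thousands of dollars" are left out, since there is no specific number to plug in.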
04

Claude Max

$200 · claude only

A pretty insane value for the money: your $200 will get you thousands of dollars a month in inference if you push it hard.

The problem is that this is the most locked-down sub of them all. You can really only use it in Claude Code, which is currently the weakest mainline harness.

+
  • Huge inference value if you hammer it
-
  • Locked to Claude Code

Snapshot for this period, not a live ranking.
