https://t.co/KFxhcvvNCP

My prediction is based purely on my read on the psychology behind the hype from Sam and the OAI team
So far it feels a LOT better than 4o but difficult to tell where it lies relative to Claude 4
I'd guess it's probably worse than Claude 4 in practice, it made a mistake there that I know Claude 4 wouldn't make
Oh, it made another absolutely idiotic mistake
I gave it Hugging Face URLs for a model download and it inexplicably changed the base URLs to point at another repo
What??!
It's free in Cursor but free isn't worth having to worry about shit like that
I will jump at the opportunity to replace Claude but it feels like this ain't it, losing confidence rapidly
And the two mistakes it made are very unpredictable, at least when Cursor fails you can tell why - skill issue mostly
This is similar to o3/Grok 4 fails, they just tend to be absolutely unhinged