GPT-4.1 almost tops Claude 3.7 on coding?!

New eval dropping using our #1 SWE-bench coding agent!

- GPT-4.1 beats Gemini 2.5 Pro and almost tops Claude 3.7 Sonnet!
- Even GPT-4.1 mini matches the performance of Claude 3.5 Sonnet V2, the top model just 2mo ago!

The evaluation is done through AugmentQA, our proprietary codebase-understanding benchmark. You can learn more at: https://www.augmentcode.com/blog/you-make-your-evals-then-your-evals-make-you-introducing-augmentqa

Try our agent yourself at: http://www.augmentcode.com.