Elon Musk: At Tesla, we basically had two different chip programs: Dojo on the training side, and then what we call AI4, which is our inference chip

The AI4 is what's currently shipping in all vehicles, and we're finalizing the design of AI5, which will be an immense jump from AI4. By some metrics, the improvement in AI5 will be 40 times better than AI4. So not 40%, 40 times

This is because we work so closely at a very fine-grained level on the AI software and the AI hardware. So we know exactly where the limiting factors are. And so effectively the AI hardware and software teams are co-designing the chip

The worst limitation on AI4 is running the SoftMax operation. We currently have to run SoftMax in around 40 steps in emulation mode, whereas that'll just be done in a few steps natively in AI5
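For readers unfamiliar with the operation, here is a minimal Python sketch of the standard softmax definition, split into its component steps. It is illustrative only: it shows why softmax is a multi-step computation in general and does not represent how AI4 or AI5 actually implement it. The function name `softmax` and the sample inputs are assumptions for the example.

```python
import math

def softmax(logits):
    """Standard numerically stable softmax over a list of floats."""
    # Step 1: find the maximum so the exponentials cannot overflow.
    peak = max(logits)
    # Step 2: exponentiate each shifted value.
    exps = [math.exp(x - peak) for x in logits]
    # Step 3: sum the exponentials.
    total = sum(exps)
    # Step 4: divide each exponential by the sum to get probabilities.
    return [e / total for e in exps]

# Hypothetical sample input, chosen only for illustration.
probs = softmax([1.0, 2.0, 3.0])
```

Each step (max, exponential, sum, divide) needs some form of hardware support. An operation without native support has to be built from simpler ones, which is the general sense of "emulation" here.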

AI5 will also be able to easily handle mixed-precision models; it'll dynamically handle mixed precision. There's a bunch of sort of technical stuff that AI5 will do a lot better

In terms of nominal raw compute, it's eight times more compute, about nine times more memory, and roughly five times more memory bandwidth

But because we're addressing some core limitations in AI4, you multiply that 8x compute improvement by another 5x improvement that comes from optimizing, at a very fine-grained silicon level, things that are currently suboptimal in AI4. That's where you get the 40x improvement
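The arithmetic behind the 40x figure can be written out as a short Python sketch. The variable names are assumptions for illustration; the two factors are the ones quoted in the remarks.

```python
# Figures as quoted in the remarks; names are illustrative only.
raw_compute_gain = 8    # "eight times more compute"
optimization_gain = 5   # "another 5x improvement" from silicon-level fixes

# The two gains are described as multiplying, not adding.
combined_gain = raw_compute_gain * optimization_gain
print(f"{combined_gain}x")  # prints "40x"
```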