Think Forward.

About: AI

Chapters: 11
15.2 min read
This series compiles works on AI and machine learning, offering a panoramic view of rapid advances, ethical debates, geopolitics, and real-world impact. Articles range from the geopolitical stakes of artificial intelligence to AI software engineering and infrastructure, and explore the rapid growth in compute power and data. Pieces cover critical areas of application such as healthcare, sustainability, the arts, and manufacturing, where AI solutions are challenging creativity, optimizing processes, diagnosing illness, and pushing the boundaries of what's possible.

11: AI development has reached a limit and it is not hardware

There is a shortage of GPUs, a shortage of RAM, a shortage of electricity. Still, none of these is the real limiting factor: it is a skill and research issue. For more than a decade now, the AI world has been dominated by an open-source arms race whose effect has been a near-total focus on engineering, to the detriment of research and meaningful developments. The result has been over-engineered proofs of concept, chief among them Transformers. The original paper mostly demonstrated that if you put attention over everything, several times over, you can beat LSTMs. Is that a surprising result? Not so much. It is morally similar to ResNets, which showed that the more you connect layers, the better the results. That is also not very surprising. Both significantly increased the size of models.

These are mostly engineering innovations. Although they opened interesting theoretical questions, they did not come from strong theoretical foundations. They came from trial-and-error copy-pasting of existing technologies, connecting them in new ways. Then these technologies got themselves copy-pasted and reconnected. Fast forward to today, and we have massive behemoths draining the computational resources of the world.

Even AI curricula have followed this trend. Today, most only very quickly skim over the mathematical and theoretical foundations, focusing more and more on building pieces of increasing complexity while dodging explanations of their inner workings. This has culminated in today's "AI builders" trend, where fully trained LLMs are strung together into assembly lines. Here is the true limitation of AI.

This mindset has been pushed so far that we have reached a physical limit. Now we can either build a much bigger Nvidia, produce 100x more RAM, and lower the price of a kWh to unseen levels; or go back to the theory and design models that are more optimal.
Optimal not because they are distilled, and not because they use lower precision, but because they do not rely on Transformers, diffusion, or any of the very costly paradigms in the shape and form they are currently used. Just as physical computers have been shrunk to sit in the palm of your hand, immaterial AI models can also be made smaller.
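The two building blocks singled out above, attention and the ResNet skip connection, are indeed simple compositions. Here is a minimal numpy sketch of both, purely illustrative: the function names and the toy sizes are my own, and this is nowhere near a full Transformer or ResNet, only the core operations the essay refers to.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def residual(x, f):
    # ResNet-style skip connection: the layer's output is added back to its input.
    return f(x) + x

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))                  # 4 tokens, model dimension 8
out = residual(x, lambda h: attention(h, h, h))  # self-attention wrapped in a skip
print(out.shape)                                 # (4, 8): shape-preserving, so stackable
```

Because the output shape matches the input shape, blocks like this can be stacked indefinitely, which is exactly the copy-paste-and-reconnect pattern the essay describes.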
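The claim that current paradigms are very costly can be made concrete with back-of-the-envelope arithmetic: in vanilla attention, the score matrix alone grows quadratically with context length. The function name and the sizes below are hypothetical, chosen only to illustrate the scaling.

```python
def attn_score_bytes(n_tokens: int, n_heads: int, bytes_per_el: int = 2) -> int:
    # The attention score matrix is n_tokens x n_tokens per head;
    # fp16 storage assumed (2 bytes per element).
    return n_tokens * n_tokens * n_heads * bytes_per_el

# Hypothetical model: 32 heads, fp16.
short_ctx = attn_score_bytes(2_048, 32)     # 268435456 bytes = 0.25 GiB of scores
long_ctx = attn_score_bytes(131_072, 32)    # 1024 GiB of scores
print(short_ctx / 2**30, "GiB")
print(long_ctx / 2**30, "GiB")
print(long_ctx // short_ctx)                # 4096: 64x the context, 4096x the memory
```

This quadratic blow-up is why simply buying more RAM and GPUs scales so poorly against longer contexts, and why the essay's alternative, redesigning the model itself, is the cheaper lever.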