Are LLMs an oversaturated field?

Hello everyone! I found this incredibly interesting video from Machine Learning Street Talk featuring Llion Jones, the co-inventor of the Transformer.

He basically says the Transformer architecture is "oversaturated" and we need to move on!

He and his team at Sakana AI are introducing a new architecture called the Continuous Thought Machine (CTM). It's designed to mimic human thinking by:

  1. Thinking Step-by-Step: It has an internal "thought dimension" that lets it process problems sequentially, rather than solving them in one go.

  2. Adaptive Compute: It naturally learns to spend less "thinking time" on easy tasks and more on hard ones (see the toy sketch after this list).
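
To make the adaptive-compute idea concrete, here is a minimal, hypothetical PyTorch sketch: a recurrent cell loops over internal "thought" steps, and a learned halting signal decides when to stop. All names here (`AdaptiveThinker`, `max_steps`, `halt_threshold`) and the simplified stopping rule are my own illustration in the spirit of adaptive-computation-time methods, not Sakana's actual CTM code.

```python
import torch
import torch.nn as nn


class AdaptiveThinker(nn.Module):
    """Toy module that "thinks" for a variable number of internal steps.

    Illustrative only; this is not the CTM architecture itself.
    """

    def __init__(self, dim: int, max_steps: int = 16, halt_threshold: float = 0.99):
        super().__init__()
        self.cell = nn.GRUCell(dim, dim)   # one internal "thought" step
        self.halt = nn.Linear(dim, 1)      # learned per-step halting signal
        self.max_steps = max_steps
        self.halt_threshold = halt_threshold

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, int]:
        h = torch.zeros_like(x)                # initial hidden "thought"
        cum_halt = x.new_zeros(x.shape[0], 1)  # accumulated halting mass
        steps = 0
        for steps in range(1, self.max_steps + 1):
            h = self.cell(x, h)                # refine the thought
            cum_halt = cum_halt + torch.sigmoid(self.halt(h))
            if bool((cum_halt >= self.halt_threshold).all()):
                break                          # confident enough: stop early
        return h, steps


model = AdaptiveThinker(dim=32)
x = torch.randn(4, 32)
h, steps = model(x)
print(f"used {steps} internal thought steps")
```

The point of the sketch is the loop: an "easy" input pushes the halting signal over the threshold quickly, while a "hard" one keeps refining the hidden state up to `max_steps`, which is the behavior point 2 describes.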

If you're interested in the future of AI beyond LLMs, definitely give this a watch. It might be where the next big leap comes from!

Video Link: https://www.youtube.com/watch?v=DtePicx_kFY&t=95s

My thoughts: I agree that AI research and application are currently focused too heavily on LLMs, chasing minor tweaks that yield minor performance improvements. I believe there are still under-explored areas for AI, especially on the vision side. If we treat the Transformer as just one ML model among many and try to invent another, AI research can go a long way.

Question for Discussion: Do you agree that simply scaling up Transformers has hit a wall? What architectural change do you think is most needed in AI right now?
