Wall, Holder, Holder Wall: Breaking the Barriers of AI Progress?

Kan Yuenyong
7 min read · Dec 2, 2024


The history of artificial intelligence and computing is one of rapid advancements punctuated by periods of stagnation, often referred to as “AI winters.” These moments, where progress seemed to hit insurmountable barriers, are as much a part of the field’s story as its triumphs. They have historically been overcome not through linear, incremental progress but through paradigm-shifting breakthroughs.

This infographic illustrates the exponential growth of public text datasets (blue line) and LLM parameter sizes (orange dashed line), alongside the theoretical depletion of usable human-generated data, represented by the “wall” (red shaded band) projected to occur between 2028 and 2030. The estimated median stock of data includes 130T tokens from Common Crawl, 510T tokens from the indexed web, and 3100T tokens from the whole web, with additional contributions from images (300T tokens) and video (1350T tokens). While the theoretical total stock of human-generated data is vast, practical constraints such as accessibility, relevance, and quality narrow the usable dataset size. As recent models like LLaMA 3.1 consume up to 15T tokens, the finite growth rate of high-quality human-generated text highlights the “wall” as a critical point where demand for data begins to outpace availability, emphasizing the need for innovations like synthetic datasets to sustain LLM scaling.
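The arithmetic behind the "wall" can be sketched in a few lines. The stock figures below are the ones quoted above; the annual growth rate of training-set size is an illustrative assumption, not a figure from the infographic, so the crossing year shifts with it.

```python
# Back-of-envelope projection of the "data wall": exponentially growing
# training-data demand versus a roughly fixed stock of indexed-web text.
# Stock and 2024-demand figures are from the article; ANNUAL_GROWTH is
# an assumed illustrative rate.

STOCK_INDEXED_WEB = 510e12   # ~510T tokens of indexed-web text
DEMAND_2024 = 15e12          # ~15T tokens consumed by LLaMA 3.1
ANNUAL_GROWTH = 2.8          # assumed ~2.8x/year growth in training-set size

def year_demand_exceeds(stock, demand, growth, start_year=2024):
    """Return the first year in which projected demand exceeds the stock."""
    year = start_year
    while demand <= stock:
        demand *= growth
        year += 1
    return year

print(year_demand_exceeds(STOCK_INDEXED_WEB, DEMAND_2024, ANNUAL_GROWTH))
```

Under these assumptions demand crosses the indexed-web stock in 2028, consistent with the 2028–2030 band shown in the infographic; a slower assumed growth rate pushes the crossing later, which is why published projections span several years.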

Charles Babbage’s mechanical computers of the 19th century were ingenious but fell short due to technological and economic limitations of the time. Gottlob Frege’s Begriffsschrift laid the groundwork for formal logic, but its impact wasn’t fully realized until Alan Turing’s conceptualization of the Universal Machine reframed computation itself. The symbolic AI of the 20th century, with its rigid, rule-based systems, ultimately gave way to deep learning — a biologically inspired, data-driven paradigm that ushered in a new era of AI capabilities.

In each case, progress was not achieved by merely refining existing tools but by stepping beyond their limitations to embrace entirely new paradigms. Today, we face another such “wall,” not solely due to the depletion of data for training but because the current AI development paradigm itself is reaching its limits. Challenges such as the diminishing returns highlighted in scaling law research, persistent alignment hurdles, and the architectural constraints of prevailing models form the core of this barrier. History suggests that overcoming such hurdles demands not brute force but a transformative shift in conceptual thinking.

Karpathy’s Post and the Role of Human Labeling

Andrej Karpathy’s recent reflections on AI training sparked debate. His comparison of large language models (LLMs) to “average data labelers” captures the paradox of AI’s achievements: systems capable of astonishing outputs are, at their core, distillations of human labor and judgment. While this framing may appear simplistic, Karpathy, a founding member of OpenAI and former director of AI at Tesla, understands the intricacies of AI training better than most. His statement is not a dismissal of AI’s progress but a critique of its current dependencies.

Karpathy’s post touches on an unspoken tension in AI: the reliance on fine-tuning and reinforcement learning from human feedback (RLHF). While pretraining establishes the general capabilities of models, it is the human-labeled datasets that align them with societal norms and ethical guidelines. This dependency creates a bottleneck — a reliance on labor-intensive processes that may not scale or generalize effectively. The real issue Karpathy’s post hints at is whether this reliance represents a temporary hurdle or a fundamental limitation.

The Wall Behind the Debate: Karpathy vs. Sutskever

Karpathy’s acknowledgment of limitations contrasts sharply with the perspective of Ilya Sutskever, a co-founder of OpenAI and one of the field’s foremost theorists. Sutskever, who has championed scaling laws and foundational breakthroughs, often adopts a cryptic and visionary tone, suggesting that the path forward lies in pushing the boundaries of existing paradigms.

The divergence between these two figures reflects broader questions about the current “wall.” Is it a matter of optimizing existing methods, as Sutskever’s focus on scaling might imply, or does it demand a fundamental shift, as Karpathy’s reflective stance suggests? This tension mirrors historical divides: symbolic AI vs. neural networks, deterministic systems vs. probabilistic approaches. Both Karpathy and Sutskever recognize the wall, but their approaches to overcoming it diverge sharply.

Categorizing the Camps in AI

The debate between Karpathy and Sutskever is part of a broader spectrum of perspectives in AI research. The field today can be loosely divided into five camps:

  1. Optimistic AGI Advocacy: Figures like Sam Altman project confidence in scaling and emergent capabilities, emphasizing incremental progress toward AGI while downplaying barriers.
  2. Reflective Pragmatists: Researchers like Karpathy grapple with AI’s limitations, seeking practical solutions while acknowledging the need for paradigm shifts.
  3. Theoretical Visionaries: Leaders like Sutskever focus on foundational breakthroughs, exploring scaling and novel theories to transcend current constraints.
  4. Realist Innovators: Yann LeCun represents this camp, emphasizing the need for new architectures and biologically inspired systems that address AI’s fundamental gaps.
  5. Human-Centered AI: Figures like Fei-Fei Li prioritize ethical alignment and interdisciplinary collaboration, ensuring AI serves humanity effectively and responsibly.

Each camp reflects a unique approach to the wall, ranging from scaling optimism to calls for revolutionary change.

Who Will Break the Wall? Lessons from History

History suggests that the wall will not be broken by incrementalism or brute force but by a conceptual shift. Figures like Yann LeCun and Ilya Sutskever are well-positioned to lead this effort:

  • Yann LeCun: His focus on world models and biologically inspired systems mirrors the paradigm-shifting nature of past breakthroughs. By rethinking AI architectures, LeCun aligns with historical innovators who challenged foundational assumptions.
  • Ilya Sutskever: Sutskever’s theoretical depth and experience in foundational research make him a strong candidate to explore scaling’s untapped potential. If he can embrace novel paradigms beyond tokenization and vectorization, he could lead a transformative leap.

Others, like Altman and Karpathy, play critical supporting roles. Altman provides the resources and momentum needed for large-scale experimentation, while Karpathy’s emphasis on practical applications ensures that breakthroughs are accessible and impactful. Fei-Fei Li’s interdisciplinary approach could inspire new methods for learning and interaction, enriching the collective effort.

Conclusion: Toward a Paradigm Shift

The current AI wall is not insurmountable — it is an invitation to rethink the field’s foundations. History reminds us that progress is rarely linear; it requires bold thinkers willing to challenge conventions and explore uncharted territory. Yann LeCun’s call for new architectures and Ilya Sutskever’s theoretical vision offer the best hope for breaking through, while leaders like Altman, Karpathy, and Li provide the infrastructure and interdisciplinary insights necessary to sustain the effort. The next breakthrough in AI will not come from any one figure or camp but from the synthesis of their diverse perspectives — a collective leap toward a new paradigm.

Side Note 1: Data Constraints and Scarcity

The paper “Will we run out of data? Limits of LLM scaling based on human-generated data” (Villalobos et al., 2022) explores the finite availability of human-generated text data as a critical limitation to the scaling of large language models (LLMs). The authors project that the datasets required for training these models could exhaust the total stock of public human-generated text data between 2026 and 2032, emphasizing that demand is growing faster than the available supply. This creates a bottleneck that could hinder further advancements in AI.

The paper proposes strategies to address these challenges, emphasizing synthetic data generation to create artificial datasets that augment training data, transfer learning to apply knowledge from data-rich domains to data-scarce areas, and data efficiency improvements to design models capable of robust performance with fewer data requirements.

This work provides a quantitative and strategic foundation for discussions about the impending “data wall,” framing it as a significant obstacle that must be addressed to sustain progress in LLM development.

Side Note 2: Shifts in AI Scaling Laws

Maxwell Zeff’s article, “Current AI scaling laws are showing diminishing returns, forcing AI labs to change course” (TechCrunch, 2024), captures a critical inflection point in AI development. Traditional scaling laws, which drove advancements by expanding compute and data during pretraining, are now exhibiting diminishing returns. This trend reflects the saturation of improvements that such methods can achieve, necessitating new paradigms.

The article highlights a promising shift toward test-time compute, a technique that increases computational resources during inference rather than pretraining. This allows models to “think” more deeply and iteratively about questions, improving reasoning and problem-solving capabilities. OpenAI’s o1 series exemplifies this approach, where models simulate step-by-step thought processes to tackle complex tasks.

Key insights include:

  • Paradigm Shift: The second era of scaling laws emphasizes inference-time optimization, moving beyond pretraining-centric methods.
  • Challenges and Opportunities: Test-time compute could demand specialized hardware and longer inference times, but it holds potential to extend the capabilities of existing architectures.
  • Incremental Innovations: Beyond test-time compute, application-level enhancements like UX innovations and better contextual tools are also recognized as drivers of significant performance gains.
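One simple and widely used form of test-time compute is self-consistency: sample several independent answers at nonzero temperature and take a majority vote, trading extra inference compute for reliability. A toy sketch, where `sample_answer` is a stub standing in for a full chain-of-thought LLM call (the 70% per-sample accuracy is an assumed value for illustration):

```python
import random
from collections import Counter

def sample_answer(question, rng):
    # Stub for an LLM call: a noisy solver that returns the correct
    # answer 70% of the time and a random wrong digit otherwise.
    return "42" if rng.random() < 0.7 else str(rng.randint(0, 9))

def self_consistency(question, n_samples, seed=0):
    # Spend more inference compute by drawing n_samples independent
    # answers, then return the majority vote.
    rng = random.Random(seed)
    votes = Counter(sample_answer(question, rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

print(self_consistency("What is 6 * 7?", n_samples=25))
```

The vote over many samples is far more reliable than any single draw, which is the essential trade the article describes: more computation at inference time in exchange for better reasoning, at the cost of latency and energy.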

This shift underscores the broader adaptability of AI labs, signaling a pivot toward exploring post-training methodologies as the next frontier in AI scalability.

Side Note 3: Breaking the AI Wall — Challenges and Pathways

Proposals to overcome the “AI wall,” such as synthetic data generation and test-time compute, offer promising avenues for advancement, yet they are fraught with significant practical challenges. These challenges must be confronted head-on if we are to transform ambitious ideas into tangible progress. The “AI wall” is not merely a technical limitation but a multifaceted barrier that requires rethinking established paradigms and embracing solutions that are both innovative and grounded in practicality.

Synthetic data generation, for instance, presents a compelling way to address data scarcity but is burdened by issues of bias, data quality, and economic scalability. Current methods remain computationally intensive, and the ethical implications of transparency and representativeness cannot be overlooked. Similarly, test-time compute introduces opportunities to enhance reasoning capabilities during inference by allocating additional resources, yet it is hindered by high energy costs, hardware dependencies, and latency risks. These factors limit its scalability and practicality for widespread adoption.

To overcome these obstacles, we must acknowledge the interdisciplinary nature of the “AI wall.” Breaking through requires more than engineering advancements; it demands a collaborative effort across fields to integrate technological, ethical, and societal perspectives. The solutions we pursue must not only address feasibility but also align with broader societal goals, ensuring that progress remains sustainable and equitable.

This is our latest research addressing the challenges posed by the ‘AI wall.’ In this work, we propose three innovative approaches to overcome these barriers: Test-Time Compute, Test-Time Training, and Emotional/Creative AI Personas. By leveraging dynamic frameworks, we explore how these methods can enhance adaptability, reasoning, and cognitive integration in AI systems. This research aims to provide a roadmap for future AI advancements in tackling the limitations of scaling laws, data scarcity, and architectural constraints. See the paper here.

The “AI wall” is neither an unconquerable force nor a challenge to be underestimated. It calls for humility in understanding its complexity and boldness in pursuing its resolution. Breaking through this barrier demands a shift from brute-force scaling, which has reached diminishing returns, to new paradigms that redefine AI development. It requires deliberate investment in foundational breakthroughs that expand the boundaries of what is possible while embracing the inherent complexity of this challenge. By confronting these barriers with clarity, focus, and innovation, we can ensure that the breakthroughs we achieve are transformative and enduring.

./end./
