SoatDev IT Consulting
SoatDev IT Consulting
  • About us
  • Expertise
  • Services
  • How it works
  • Contact Us
  • News
  • January 16, 2025
  • Rss Fetcher

Many are voicing concern that the world is running out of data and that this will be a blocker to progress toward smarter AI models. One paper in fact projects timelines for when we will run out.

AI researchers are looking for ways to adapt. Nvidia has trained a specific model to generate synthetic data for training other models. Some use this approach, though using AI-generated data to train AI is not without risk.

Others have asked a bigger question, namely, is something fundamentally missing in our approach that relies so heavily on data. Certainly the bitter lesson thesis and the position long advocated by Geoffrey Hinton argue for a data-first approach with “as few” prior assumptions as possible (though every model has a bias).

But it’s currently simply unknown whether just adding more data and compute will do the trick for achieving general intelligence or whether something else is needed. Neurosymbolic approaches are being experimented with, in various forms. But it’s unclear whether these can scale up to the level needed. And the frontier labs, laser-focused on the current paradigm, may not have adequate time or resources to investigate high-risk/high-reward alternatives.

From a theoretical standpoint, sometimes more data is simply not enough. As discussed in a previous post, some problems in mathematics and engineering require exponentially large amount of data to train neural network models. Exponentials can work in your favor, but also can work against you (think of the Tower of Hanoi problem or the Wheat and Chessboard problem). Some problems on certain models cannot be solved by any amount of data available in the entire universe.

The requirements for solving these problems can grow much more quickly than expected. The strength of neural networks, their flexibility, their universal approximation property, can also be a weakness. It can take so much data to nail down all the parameters so that the model is completely error free. Thankfully, many other problems that people want to solve (such as human language modeling) are fundamentally lower dimensional and thus less vulnerable to this problem.

We just don’t know whether the current data-hungry approach will be enough—or whether we’ll need to learn another bitter lesson.

The post Can AI Models Reason: Is Data All You Need? first appeared on John D. Cook.

Previous Post
Next Post

Recent Posts

  • AI dev tools for Windows get a fresh coat of paint
  • Microsoft wants to tap AI to accelerate scientific discovery
  • It’ll soon be free to publish apps to the Microsoft Store
  • NLWeb is Microsoft’s project to bring more chatbots to webpages
  • Devs can now tap Microsoft Edge to power AI web apps

Categories

  • Industry News
  • Programming
  • RSS Fetched Articles
  • Uncategorized

Archives

  • May 2025
  • April 2025
  • February 2025
  • January 2025
  • December 2024
  • November 2024
  • October 2024
  • September 2024
  • August 2024
  • July 2024
  • June 2024
  • May 2024
  • April 2024
  • March 2024
  • February 2024
  • January 2024
  • December 2023
  • November 2023
  • October 2023
  • September 2023
  • August 2023
  • July 2023
  • June 2023
  • May 2023
  • April 2023

Tap into the power of Microservices, MVC Architecture, Cloud, Containers, UML, and Scrum methodologies to bolster your project planning, execution, and application development processes.

Solutions

  • IT Consultation
  • Agile Transformation
  • Software Development
  • DevOps & CI/CD

Regions Covered

  • Montreal
  • New York
  • Paris
  • Mauritius
  • Abidjan
  • Dakar

Subscribe to Newsletter

Join our monthly newsletter subscribers to get the latest news and insights.

© Copyright 2023. All Rights Reserved by Soatdev IT Consulting Inc.